Abstract

If $X$ is an $n$-element set, we call a family $\mathcal{G} \subset \mathcal{P}X$ a $k$-generator for $X$ if every $x \subset X$ can be expressed as a union of at most $k$ disjoint sets in $\mathcal{G}$. Frein, Lévêque and Sebő [10] conjectured that for $n > 2k$, the smallest $k$-generators for $X$ are obtained by taking a partition of $X$ into classes of sizes as equal as possible, and taking the union of the power-sets of the classes. We prove this conjecture for all sufficiently large $n$ when $k = 2$, and for $n$ a sufficiently large multiple of $k$ when $k \geq 3$.

1 Introduction

Let $X$ be an $n$-element set, and let $\mathcal{P}X$ denote the set of all subsets of $X$. We call a family $\mathcal{G} \subset \mathcal{P}X$ a $k$-generator for $X$ if every $x \subset X$ can be expressed as a union of at most $k$ disjoint sets in $\mathcal{G}$. For example, let $(V_i)_{i=1}^{k}$ be a partition of $X$ into $k$ classes of sizes as equal as possible; then

\[ \mathcal{F}_{n,k} := \bigcup_{i=1}^{k} \mathcal{P}(V_i) \setminus \{\emptyset\} \]

is a $k$-generator for $X$. We call a $k$-generator of this form canonical. If $n = qk + r$, where $0 \leq r < k$, then

\[ |\mathcal{F}_{n,k}| = (k - r)(2^{q} - 1) + r(2^{q+1} - 1) = (k + r)2^{q} - k. \]

Frein, Lévêque and Sebő [10] conjectured that for any $k \leq n$, this is the smallest possible size of a $k$-generator for $X$.
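The definition and the size formula can be checked by brute force for small parameters. The following is a minimal sketch, not part of the original argument; the function names (`canonical_generator`, `canonical_size`, `is_k_generator`) are hypothetical, the ground set is taken to be `range(n)`, and the empty set is treated as the union of zero sets.

```python
from itertools import combinations


def canonical_generator(n, k):
    """Partition range(n) into k classes of near-equal size and return
    (classes, family), where family holds every nonempty subset of each class."""
    q, r = divmod(n, k)
    sizes = [q + 1] * r + [q] * (k - r)  # r classes of size q+1, k-r of size q
    classes, start = [], 0
    for s in sizes:
        classes.append(list(range(start, start + s)))
        start += s
    family = set()
    for cls in classes:
        for size in range(1, len(cls) + 1):
            for sub in combinations(cls, size):
                family.add(frozenset(sub))
    return classes, family


def canonical_size(n, k):
    """Closed form (k + r) * 2^q - k, where n = q*k + r and 0 <= r < k."""
    q, r = divmod(n, k)
    return (k + r) * 2**q - k


def is_k_generator(family, n, k):
    """Brute force: is every subset of range(n) a union of at most k
    pairwise disjoint members of `family`?"""
    reachable = {frozenset()}  # unions of at most j disjoint members so far
    frontier = {frozenset()}   # sets first reached in the previous round
    for _ in range(k):
        new = {u | g for u in frontier for g in family if not (u & g)}
        frontier = new - reachable
        reachable |= new
    return len(reachable) == 2**n
```

For instance, `canonical_size(5, 2)` evaluates the formula with $q = 2$, $r = 1$, and `is_k_generator(canonical_generator(5, 2)[1], 5, 2)` confirms the generating property on a 5-element set. The search is exponential in $n$, so it is only meant for very small cases.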

Conjecture 1 (Frein, Lévêque, Sebő). If $X$ is an $n$-element set, $k \leq n$, and $\mathcal{G} \subset \mathcal{P}X$ is a $k$-generator for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,k}|$. If $n > 2k$, equality holds only if $\mathcal{G}$ is a canonical $k$-generator for $X$.

They proved this for $k \leq n \leq 3k$, but their methods do not seem to work for larger $n$.

For $k = 2$, Conjecture 1 is a weakening of a conjecture of Erdős. We call a family $\mathcal{G} \subset \mathcal{P}X$ a $k$-base for $X$ if every $x \subset X$ can be expressed as a union of at most $k$ (not necessarily disjoint) sets in $\mathcal{G}$. Erdős (see [11]) made the following

Conjecture 2 (Erdős). If $X$ is an $n$-element set, and $\mathcal{G} \subset \mathcal{P}X$ is a 2-base for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,2}|$.

In fact, Frein, Lévêque and Sebő [10] made the analogous conjecture for all $k$.

Conjecture 3 (Frein, Lévêque, Sebő). If $X$ is an $n$-element set, $k \leq n$, and $\mathcal{G} \subset \mathcal{P}X$ is a $k$-base for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,k}|$. If $n > 2k$, equality holds only if $\mathcal{G}$ is a canonical $k$-generator for $X$.

Again, they were able to prove this for $k \leq n \leq 3k$.

In this paper, we study $k$-generators when $n$ is large compared to $k$. Our main results are as follows.

Theorem 4. If $n$ is sufficiently large, $X$ is an $n$-element set, and $\mathcal{G} \subset \mathcal{P}X$ is a 2-generator for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,2}|$. Equality holds only if $\mathcal{G}$ is of the form $\mathcal{F}_{n,2}$.

Theorem 5. If $k \in \mathbb{N}$, $n$ is a sufficiently large multiple of $k$, $X$ is an $n$-element set, and $\mathcal{G}$ is a $k$-generator for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,k}|$. Equality holds only if $\mathcal{G}$ is of the form $\mathcal{F}_{n,k}$.

In other words, we prove Conjecture 1 for all sufficiently large $n$ when $k = 2$, and for $n$ a sufficiently large multiple of $k$ when $k \geq 3$. We use some ideas of Alon and Frankl [1], and also techniques of the first author from [5], in which asymptotic results were obtained.

As noted in [10], if $\mathcal{G} \subset \mathcal{P}X$ is a $k$-generator (or even a $k$-base) for $X$, then the number of ways of choosing at most $k$ sets from $\mathcal{G}$ is clearly at least the number of subsets of $X$. Therefore $|\mathcal{G}|^{k} \geq 2^{n}$, which immediately gives

\[ |\mathcal{G}| \geq 2^{n/k}. \]

Moreover, if $|\mathcal{G}| = m$, then

\[ \sum_{i=0}^{k} \binom{m}{i} \geq 2^{n}. \qquad (1) \]

Crudely, we have

\[ \sum_{i=0}^{k-1} \binom{m}{i} \leq k m^{k-1}, \]

so

\[ \binom{m}{k} + k m^{k-1} \geq 2^{n}. \]

Hence, if $k$ is fixed, then

\[ (1 + O(1/m)) \binom{m}{k} \geq 2^{n}, \]

so

\[ |\mathcal{G}| \geq (k!)^{1/k} 2^{n/k} (1 - o(1)). \qquad (2) \]

Observe that if $n = qk + r$, where $0 \leq r < k$, then

\[ |\mathcal{F}_{n,k}| = (k + r)2^{q} - k < (k + r)2^{q} = k 2^{n/k} (1 + r/k) 2^{-r/k} \leq c_0 k 2^{n/k}, \qquad (3) \]

where

\[ c_0 := \max_{0 \leq x \leq 1} (1 + x) 2^{-x} = \frac{2}{e \log 2} = 1.061 \text{ (to 3 d.p.)}. \]
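The constant in (3) is easy to sanity-check numerically. The sketch below assumes the closed form $c_0 = 2/(e \log 2)$, which is the maximum of $(1+x)2^{-x}$ on $[0,1]$ (attained at $x = 1/\log 2 - 1$); the helper names `c0` and `ratio` are illustrative only.

```python
import math


def c0():
    """Maximum of (1 + x) * 2**(-x) over 0 <= x <= 1, i.e. 2 / (e * ln 2)."""
    return 2 / (math.e * math.log(2))


def ratio(n, k):
    """|F_{n,k}| / (k * 2^{n/k}), with |F_{n,k}| = (k + r) * 2^q - k."""
    q, r = divmod(n, k)
    return ((k + r) * 2**q - k) / (k * 2 ** (n / k))
```

With these helpers, `ratio(n, k) < c0()` can be tested over a range of small `n` and `k`, which is exactly the content of inequality (3).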

Now for some preliminaries. We use the following standard notation. For $n \in \mathbb{N}$, $[n]$ will denote the set $\{1, 2, \ldots, n\}$. If $x$ and $y$ are disjoint sets, we will sometimes write their union as $x \sqcup y$, rather than $x \cup y$, to emphasize the fact that the sets are disjoint.

If $k \in \mathbb{N}$, and $G$ is a graph, $K_k(G)$ will denote the number of $k$-cliques in $G$. Let $T_s(n)$ denote the $s$-partite Turán graph (the complete $s$-partite graph on $n$ vertices with parts of sizes as equal as possible), and let $t_s(n) = e(T_s(n))$. For $l \in \mathbb{N}$, $C_l$ will denote the cycle of length $l$.

If $F$ is a (labelled) graph on $f$ vertices, with vertex-set $\{v_1, \ldots, v_f\}$ say, and $\mathbf{t} = (t_1, \ldots, t_f) \in \mathbb{N}^{f}$, we define the $\mathbf{t}$-blow-up of $F$, $F \otimes \mathbf{t}$, to be the graph obtained by replacing $v_i$ with an independent set $V_i$ of size $t_i$, and joining each vertex of $V_i$ to each vertex of $V_j$ whenever $v_i v_j$ is an edge of $F$. With slight abuse of notation, for $t \in \mathbb{N}$ we will write $F \otimes t$ for the symmetric blow-up $F \otimes (t, \ldots, t)$.

If $F$ and $G$ are graphs, we write $c_F(G)$ for the number of injective graph homomorphisms from $F$ to $G$, meaning injections from $V(F)$ to $V(G)$ which take edges of $F$ to edges of $G$. The density of $F$ in $G$ is defined to be

\[ d_F(G) = \frac{c_F(G)}{|G|(|G| - 1) \cdots (|G| - |F| + 1)}, \]

i.e. the probability that a uniform random injective map from $V(F)$ to $V(G)$ is a graph homomorphism from $F$ to $G$. Hence, when $F = K_k$, the density of $K_k$'s in an $n$-vertex graph $G$ is simply $K_k(G)/\binom{n}{k}$.

Although we will be interested in the density $d_F(G)$, it will sometimes be more convenient to work with the following closely related quantity, which behaves very nicely when we take blow-ups. We write $\mathrm{Hom}_F(G)$ for the number of homomorphisms from $F$ to $G$, and we define the homomorphism density of $F$ in $G$ to be

\[ h_F(G) = \frac{\mathrm{Hom}_F(G)}{|G|^{|F|}}, \]

i.e. the probability that a uniform random map from $V(F)$ to $V(G)$ is a graph homomorphism from $F$ to $G$.

Observe that if $F$ is a graph on $f$ vertices, and $G$ is a graph on $n$ vertices, then the number of homomorphisms from $F$ to $G$ which are not injections is clearly at most

\[ \binom{f}{2} n^{f-1}. \]

Hence,

\[ d_F(G) \geq \frac{h_F(G) n^{f} - \binom{f}{2} n^{f-1}}{n(n-1) \cdots (n-f+1)} \geq h_F(G) - O(1/n) \qquad (4) \]

if $f$ is fixed. In the other direction,

\[ d_F(G) \leq \frac{n^{f}}{n(n-1) \cdots (n-f+1)} h_F(G) \leq (1 + O(1/n)) h_F(G) \qquad (5) \]

if $f$ is fixed. Hence, when working inside large graphs, we can pass freely between the density of a fixed graph $F$ and its homomorphism density, with an 'error' of only $O(1/n)$.
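To make the two notions concrete, here is a small brute-force sketch (an illustration under stated assumptions, not an efficient method). Graphs are assumed simple and are given as edge lists on vertex sets `range(f)` and `range(n)`; the function names are hypothetical.

```python
from itertools import permutations, product
from math import perm


def _symmetric(edges):
    """Set of ordered pairs (u, v) and (v, u) for every undirected edge."""
    return {(u, v) for u, v in edges} | {(v, u) for u, v in edges}


def hom_density(f_edges, f, g_edges, n):
    """h_F(G): fraction of all maps range(f) -> range(n) that are homomorphisms."""
    adj = _symmetric(g_edges)
    homs = sum(all((m[a], m[b]) in adj for a, b in f_edges)
               for m in product(range(n), repeat=f))
    return homs / n**f


def density(f_edges, f, g_edges, n):
    """d_F(G): fraction of injective maps range(f) -> range(n) that are homomorphisms."""
    adj = _symmetric(g_edges)
    inj = sum(all((m[a], m[b]) in adj for a, b in f_edges)
              for m in permutations(range(n), f))
    return inj / perm(n, f)
```

For a triangle $F$ inside $G = K_4$, every injective map is a homomorphism while only $24$ of the $64$ maps are, so the two densities differ substantially for such a small $n$; the gap closes at rate $O(1/n)$ as in (4) and (5).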

Finally, we will make frequent use of the AM/GM inequality:

Theorem 6. If $x_1, \ldots, x_n \geq 0$, then

\[ \left( \prod_{i=1}^{n} x_i \right)^{1/n} \leq \frac{1}{n} \sum_{i=1}^{n} x_i. \]

2 The case $k \mid n$ via extremal graph theory.

For $n$ a sufficiently large multiple of $k$, it turns out to be possible to prove Conjecture 1 using stability versions of Turán-type results. We will prove the following

Theorem 5. If $k \in \mathbb{N}$, $n$ is a sufficiently large multiple of $k$, $X$ is an $n$-element set, and $\mathcal{G}$ is a $k$-generator for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,k}|$. Equality holds only if $\mathcal{G}$ is of the form $\mathcal{F}_{n,k}$.

We need a few more definitions. Let $H$ denote the graph with vertex-set $\mathcal{P}X$, where we join two subsets $x, y \subset X$ if they are disjoint. With slight abuse of terminology, we call $H$ the 'Kneser' graph on $\mathcal{P}X$ (although this usually means the analogous graph on $X^{(r)}$). If $\mathcal{F}, \mathcal{G} \subset \mathcal{P}X$, we say that $\mathcal{G}$ $k$-generates $\mathcal{F}$ if every set in $\mathcal{F}$ is a disjoint union of at most $k$ sets in $\mathcal{G}$.
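Cliques in an induced subgraph of $H$ are simply collections of pairwise disjoint sets, so for small families they can be counted directly. A minimal sketch, assuming sets are represented as `frozenset` objects; the name `disjoint_cliques` is hypothetical.

```python
from itertools import combinations


def disjoint_cliques(family, k):
    """Number of k-subsets of `family` whose members are pairwise disjoint,
    i.e. the number of k-cliques in the induced subgraph H[family]."""
    family = list(family)
    return sum(
        all(not (a & b) for a, b in combinations(group, 2))
        for group in combinations(family, k)
    )
```

For the canonical 2-generator of a 4-element set with classes $\{0,1\}$ and $\{2,3\}$, this counts 11 edges: the 9 pairs taking one set from each class, plus the 2 pairs of disjoint singletons inside a class.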

The main steps of the proof: First, we will show that for any $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| \geq \Omega(2^{n/k})$, the density of $K_{k+1}$'s in the induced subgraph $H[\mathcal{A}]$ is $o(1)$.

Secondly, we will observe that if $n$ is a sufficiently large multiple of $k$, and $\mathcal{G} \subset \mathcal{P}X$ has size close to $|\mathcal{F}_{n,k}|$ and $k$-generates almost all subsets of $X$, then $K_k(H[\mathcal{G}])$ is very close to $K_k(T_k(|\mathcal{G}|))$, the number of $K_k$'s in the $k$-partite Turán graph on $|\mathcal{G}|$ vertices.

We will then prove that if $G$ is any graph with small $K_{k+1}$-density, and with $K_k(G)$ close to $K_k(T_k(|G|))$, then $G$ can be made $k$-partite by removing a small number of edges. This can be seen as a (strengthened) variant of the Simonovits Stability Theorem [9], which states that any $K_{k+1}$-free graph $G$ with $e(G)$ close to the maximum $e(T_k(|G|))$ can be made $k$-partite by removing a small number of edges.

This will enable us to conclude that $H[\mathcal{G}]$ can be made $k$-partite by the removal of a small number of edges, and therefore the structure of $H[\mathcal{G}]$ is close to that of the Turán graph $T_k(|\mathcal{G}|)$. This in turn will enable us to show that the structure of $\mathcal{G}$ is close to that of a canonical $k$-generator $\mathcal{F}_{n,k}$ (Proposition 9).

Finally, we will use a perturbation argument to show that if $n$ is sufficiently large, and $|\mathcal{G}| \leq |\mathcal{F}_{n,k}|$, then $\mathcal{G} = \mathcal{F}_{n,k}$, completing the proof.

In fact, we will first show that if $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| \geq \Omega(2^{n/k})$, then the homomorphism density of $K_{k+1} \otimes t$ in $H[\mathcal{A}]$ is $o(1)$, provided $t$ is sufficiently large depending on $k$. Hence, we will need the following (relatively well-known) lemma relating the homomorphism density of a graph to that of its blow-up.

Lemma 7. Let $F$ be a graph on $f$ vertices, let $\mathbf{t} = (t_1, t_2, \ldots, t_f) \in \mathbb{N}^{f}$, and let $F \otimes \mathbf{t}$ denote the $\mathbf{t}$-blow-up of $F$. If the homomorphism density of $F$ in $G$ is $p$, then the homomorphism density of $F \otimes \mathbf{t}$ in $G$ is at least $p^{t_1 t_2 \cdots t_f}$.

Proof. This is a simple convexity argument, essentially that of [9]. It will suffice to prove the statement of the lemma when $\mathbf{t} = (1, \ldots, 1, r)$ for some $r \in \mathbb{N}$. We think of $F$ as a (labelled) graph on vertex set $[f] = \{1, 2, \ldots, f\}$, and $G$ as a (labelled) graph on vertex set $[n]$. Define the function $\chi : [n]^{f} \to \{0, 1\}$ by

\[ \chi(v_1, \ldots, v_f) = \begin{cases} 1 & \text{if } i \mapsto v_i \text{ is a homomorphism from } F \text{ to } G, \\ 0 & \text{otherwise.} \end{cases} \]

Then we have

\[ h_F(G) = \frac{1}{n^{f}} \sum_{(v_1, \ldots, v_f) \in [n]^{f}} \chi(v_1, \ldots, v_f) = p. \]

The homomorphism density $h_{F \otimes (1, \ldots, 1, r)}(G)$ of $F \otimes (1, \ldots, 1, r)$ in $G$ is:

\[
\begin{aligned}
h_{F \otimes (1, \ldots, 1, r)}(G) &= \frac{1}{n^{f-1+r}} \sum_{(v_1, \ldots, v_{f-1}, v_f^{(1)}, \ldots, v_f^{(r)}) \in [n]^{f-1+r}} \prod_{i=1}^{r} \chi(v_1, \ldots, v_{f-1}, v_f^{(i)}) \\
&= \frac{1}{n^{f-1}} \sum_{(v_1, \ldots, v_{f-1}) \in [n]^{f-1}} \left( \frac{1}{n} \sum_{v_f \in [n]} \chi(v_1, \ldots, v_{f-1}, v_f) \right)^{r} \\
&\geq \left( \frac{1}{n^{f-1}} \sum_{(v_1, \ldots, v_{f-1}) \in [n]^{f-1}} \frac{1}{n} \sum_{v_f \in [n]} \chi(v_1, \ldots, v_{f-1}, v_f) \right)^{r} \\
&= \left( \frac{1}{n^{f}} \sum_{(v_1, \ldots, v_{f-1}, v_f) \in [n]^{f}} \chi(v_1, \ldots, v_{f-1}, v_f) \right)^{r} \\
&= p^{r}.
\end{aligned}
\]

Here, the inequality follows from applying Jensen's Inequality to the convex function $x \mapsto x^{r}$. This proves the lemma for $\mathbf{t} = (1, \ldots, 1, r)$. By symmetry, the statement of the lemma holds for all vectors of the form $(1, \ldots, 1, r, 1, \ldots, 1)$. Clearly, we may obtain $F \otimes \mathbf{t}$ from $F$ by a sequence of blow-ups by these vectors, proving the lemma. □
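The inequality of Lemma 7 can be observed numerically on tiny graphs. The sketch below is only an illustration: it takes $F$ to be a single edge, so that $F \otimes (t_1, t_2)$ is the complete bipartite graph $K_{t_1, t_2}$, and counts homomorphisms by exhaustive enumeration. The helper names `blow_up` and `hom_density` are hypothetical.

```python
from itertools import product


def blow_up(f_edges, t):
    """Edge list and order of the t-blow-up of F: vertex i of F is replaced by
    t[i] independent copies, labelled consecutively."""
    offsets = [0]
    for ti in t:
        offsets.append(offsets[-1] + ti)
    edges = [(offsets[a] + i, offsets[b] + j)
             for a, b in f_edges
             for i in range(t[a])
             for j in range(t[b])]
    return edges, offsets[-1]


def hom_density(f_edges, f, g_edges, n):
    """Fraction of all maps range(f) -> range(n) sending edges of F to edges of G."""
    adj = {(u, v) for u, v in g_edges} | {(v, u) for u, v in g_edges}
    homs = sum(all((m[a], m[b]) in adj for a, b in f_edges)
               for m in product(range(n), repeat=f))
    return homs / n**f
```

Taking $G$ to be the path on three vertices, the edge has homomorphism density $p = 4/9$, and the star $K_{1,2}$ has homomorphism density $6/27 \geq p^{2}$, in line with the lemma.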

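The following lemma (a rephrasing of Lemma 4.2 in Alon and Frankl [1]) gives an upper bound on the homomorphism density of $K_{k+1} \otimes t$ in large induced subgraphs of the Kneser graph $H$.

Lemma 8. If $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| = m = 2^{(\delta + 1/(k+1))n}$, then

\[ h_{K_{k+1} \otimes t}(H[\mathcal{A}]) \leq (k + 1) 2^{-n(\delta t - 1)}. \]

Proof. We follow the proof of Alon and Frankl cited above. Choose $(k+1)t$ members of $\mathcal{A}$ uniformly at random with replacement, $(A_i^{(j)})_{1 \leq i \leq k+1,\ 1 \leq j \leq t}$. The homomorphism density of $K_{k+1} \otimes t$ in $H[\mathcal{A}]$ is precisely the probability that the unions

\[ U_i = \bigcup_{j=1}^{t} A_i^{(j)} \]

are pairwise disjoint. If this event occurs, then $|U_i| \leq n/(k+1)$ for some $i$. For each $i \in [k+1]$, we have

\[
\begin{aligned}
\Pr\{|U_i| \leq n/(k+1)\} &= \Pr\left( \bigcup_{S \subset X : |S| \leq n/(k+1)} \bigcap_{j=1}^{t} \{A_i^{(j)} \subset S\} \right) \\
&\leq \sum_{|S| \leq n/(k+1)} \Pr\left( \bigcap_{j=1}^{t} \{A_i^{(j)} \subset S\} \right) \\
&\leq \sum_{|S| \leq n/(k+1)} \left( 2^{|S|}/m \right)^{t} \\
&\leq 2^{n} \left( 2^{n/(k+1)}/m \right)^{t} \\
&= 2^{-n(\delta t - 1)}.
\end{aligned}
\]

Hence,

\[ \Pr\left( \bigcup_{i=1}^{k+1} \{|U_i| \leq n/(k+1)\} \right) \leq \sum_{i=1}^{k+1} \Pr\{|U_i| \leq n/(k+1)\} \leq (k + 1) 2^{-n(\delta t - 1)}. \]

Therefore,

\[ h_{K_{k+1} \otimes t}(H[\mathcal{A}]) \leq (k + 1) 2^{-n(\delta t - 1)}, \]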
as required. □

From the trivial bound above, any $k$-generator $\mathcal{G}$ has $|\mathcal{G}| \geq 2^{n/k}$, so $\delta \geq 1/(k(k+1))$, and therefore, choosing $t = t_k := 2k(k+1)$, we see that

\[ h_{K_{k+1} \otimes t_k}(H[\mathcal{G}]) \leq (k + 1) 2^{-n}. \]

Hence, by Lemma 7,

\[ h_{K_{k+1}}(H[\mathcal{G}]) \leq O_k\left( 2^{-n/t_k^{k+1}} \right). \]

Therefore, by (5),

\[ d_{K_{k+1}}(H[\mathcal{G}]) \leq O_k\left( 2^{-n/t_k^{k+1}} \right) \leq 2^{-a_k n} \qquad (6) \]

provided $n$ is sufficiently large depending on $k$, where $a_k > 0$ depends only on $k$.

Assume now that $n$ is a multiple of $k$, so that $|\mathcal{F}_{n,k}| = k 2^{n/k} - k$. We will prove the following 'stability' result.

Proposition 9. Let $k \in \mathbb{N}$ be fixed. If $n$ is a multiple of $k$, and $\mathcal{G} \subset \mathcal{P}X$ has $|\mathcal{G}| \leq (1 + \eta)|\mathcal{F}_{n,k}|$ and $k$-generates at least $(1 - \epsilon)2^{n}$ subsets of $X$, then there exists an equipartition $(S_i)_{i=1}^{k}$ of $X$ such that

\[ \left| \mathcal{G} \cap \left( \bigcup_{i=1}^{k} \mathcal{P}S_i \right) \right| \geq \left( 1 - C_k \epsilon^{1/k} - D_k \eta^{1/k} - 2^{-\xi_k n} \right) |\mathcal{F}_{n,k}|, \]

where $C_k, D_k, \xi_k > 0$ depend only on $k$.

We first collect some results used in the proof. We will need the following theorem of Erdős [7].

Theorem 10 (Erdős). If $r \leq k$, and $G$ is a $K_{k+1}$-free graph on $n$ vertices, then

\[ K_r(G) \leq K_r(T_k(n)). \]

We will also need the following well-known lemma, which states that a dense $k$-partite graph has an induced subgraph with high minimum degree.

Lemma 11. Let $G$ be an $n$-vertex, $k$-partite graph with

\[ e(G) \geq (1 - 1/k - \delta) n^{2}/2. \]

Then there exists an induced subgraph $G' \subset G$ with $|G'| = n' \geq (1 - \sqrt{\delta})n$ and minimum degree $\delta(G') \geq (1 - 1/k - \sqrt{\delta})(n' - 1)$.

Proof. We perform the following algorithm to produce $G'$. Let $G_1 = G$. Suppose that at stage $i$, we have a graph $G_i$ on $n - i + 1$ vertices. If there is a vertex $v$ of $G_i$ with $d(v) < (1 - 1/k - \eta)(n - i)$, let $G_{i+1} = G_i - v$; otherwise, stop and set $G' = G_i$. Suppose the process terminates after $j = \alpha n$ steps. Then we have removed at most

\[ (1 - 1/k - \eta) \sum_{i=1}^{j} (n - i) = (1 - 1/k - \eta) \left( \binom{n}{2} - \binom{n-j}{2} \right) \]

edges, and the remaining graph, being $k$-partite on $n - j = (1 - \alpha)n$ vertices, has at most

\[ (1 - \alpha)^{2} (1 - 1/k) n^{2}/2 \]

edges. But our original graph had at least

\[ (1 - 1/k - \delta) n^{2}/2 \]

edges, and therefore

\[ (1 - 1/k - \eta)(1 - (1 - \alpha)^{2}) n^{2}/2 + (1 - \alpha)^{2} (1 - 1/k) n^{2}/2 \geq (1 - 1/k - \delta) n^{2}/2, \]

so

\[ \eta (1 - \alpha)^{2} \geq \eta - \delta. \]

Choosing $\eta = \sqrt{\delta}$, we obtain

\[ \eta (1 - \alpha)^{2} \geq \eta (1 - \eta), \]

and therefore

\[ (1 - \alpha)^{2} \geq 1 - \eta, \]

so

\[ \alpha \leq 1 - (1 - \eta)^{1/2} \leq \eta. \]

Hence, our induced subgraph $G'$ has order

\[ |G'| = n' \geq (1 - \sqrt{\delta}) n, \]

and minimum degree

\[ \delta(G') \geq (1 - 1/k - \sqrt{\delta})(n' - 1). \]

□
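The deletion procedure in this proof translates directly into code. The following is a minimal sketch under the assumption that the graph is given as a dictionary mapping each vertex to its set of neighbours; the function name `prune_low_degree` is hypothetical, and the threshold uses $\eta = \sqrt{\delta}$ as in the proof.

```python
import math


def prune_low_degree(adj, k, delta):
    """Repeatedly delete a vertex of degree below
    (1 - 1/k - sqrt(delta)) * (current order - 1), as in the proof of Lemma 11.
    `adj` maps each vertex to the set of its neighbours; a pruned copy is returned."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    eta = math.sqrt(delta)
    while adj:
        threshold = (1 - 1 / k - eta) * (len(adj) - 1)
        low = next((v for v, nb in adj.items() if len(nb) < threshold), None)
        if low is None:
            break  # every remaining vertex meets the degree condition
        for u in adj.pop(low):
            adj[u].discard(low)
    return adj
```

On termination, every surviving vertex has degree at least the threshold computed for the final order, which is the minimum-degree conclusion of the lemma; the bound on the number of deleted vertices comes from the edge-counting in the proof, not from the code.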

We will also need Shearer's Entropy Lemma.

Lemma 12 (Shearer's Entropy Lemma, [4]). Let $S$ be a finite set, and let $\mathcal{A}$ be an $r$-cover of $S$, meaning a collection of subsets of $S$ such that every element of $S$ is contained in at least $r$ sets in $\mathcal{A}$. Let $\mathcal{F}$ be a collection of subsets of $S$. For $A \subset S$, let $\mathcal{F}_A = \{F \cap A : F \in \mathcal{F}\}$ denote the projection of $\mathcal{F}$ onto the set $A$. Then

\[ |\mathcal{F}|^{r} \leq \prod_{A \in \mathcal{A}} |\mathcal{F}_A|. \]

In addition, we require two 'stability' versions of Turán-type results in extremal graph theory. The first states that a graph with a very small $K_{k+1}$-density cannot have $K_r$-density much higher than the $k$-partite Turán graph on the same number of vertices, for any $r \leq k$.

Lemma 13. Let $r \leq k$ be integers. Then there exist $C, D > 0$ such that for any $\alpha \geq 0$, any $n$-vertex graph $G$ with $K_{k+1}$-density at most $\alpha$ has $K_r$-density at most

\[ \frac{k(k-1) \cdots (k-r+1)}{k^{r}} \left( 1 + C \alpha^{1/(k+2)} + D/n \right). \]

Proof. We use a straightforward sampling argument. Let $G$ be as in the statement of the lemma. Let $\zeta \binom{n}{l}$ be the number of $l$-subsets $U \subset V(G)$ such that $G[U]$ contains a copy of $K_{k+1}$, so that $\zeta$ is simply the probability that a uniform random $l$-subset of $V(G)$ contains a $K_{k+1}$. Simple counting (or the union bound) gives

\[ \zeta \leq \binom{l}{k+1} \alpha. \]

By Theorem 10, each $K_{k+1}$-free $G[U]$ contains at most

\[ \binom{k}{r} \left( \frac{l}{k} \right)^{r} \]

$K_r$'s. Therefore, the density of $K_r$'s in each such $G[U]$ satisfies

\[ d_{K_r}(G[U]) \leq \frac{k(k-1) \cdots (k-r+1)}{k^{r}} \cdot \frac{l^{r}}{l(l-1) \cdots (l-r+1)} \leq \frac{k(k-1) \cdots (k-r+1)}{k^{r}} (1 + O(1/l)). \qquad (7) \]

Note that one can choose a random $r$-set in the graph $G$ by first choosing a random $l$-set $U$, and then choosing a random $r$-subset of $U$. The density of $K_r$'s in $G$ is simply the probability that a uniform random $r$-subset of $V(G)$ induces a $K_r$, and therefore

\[ d_{K_r}(G) = \mathbb{E}_U [d_{K_r}(G[U])], \]

where the expectation is taken over a uniform random choice of $U$. If $U$ is $K_{k+1}$-free, which happens with probability $1 - \zeta$, we use the upper bound (7); if $U$ contains a $K_{k+1}$, which happens with probability $\zeta$, we use the trivial bound $d_{K_r}(G[U]) \leq 1$. We see that the density of $K_r$'s in $G$ satisfies:

\[
\begin{aligned}
d_{K_r}(G) &\leq (1 - \zeta) \frac{k(k-1) \cdots (k-r+1)}{k^{r}} (1 + O(1/l)) + \zeta \\
&\leq \frac{k(k-1) \cdots (k-r+1)}{k^{r}} + O(1/l) + \binom{l}{k+1} \alpha \\
&\leq \frac{k(k-1) \cdots (k-r+1)}{k^{r}} + O(1/l) + l^{k+1} \alpha.
\end{aligned}
\]

Choosing $l = \min\{\lceil \alpha^{-1/(k+2)} \rceil, n\}$ proves the lemma. □

The second result states that an $n$-vertex graph with a small $K_{k+1}$-density, a $K_k$-density not too much less than that of $T_k(n)$, and a $K_{k-1}$-density not too much more than that of $T_k(n)$, can be made into a $k$-partite graph by the removal of only a small number of edges.

Theorem 14. Let $G$ be an $n$-vertex graph with $K_{k+1}$-density at most $\alpha$, $K_{k-1}$-density at most

\[ (1 + \beta) \frac{k!}{k^{k-1}}, \]

and $K_k$-density at least

\[ (1 - \gamma) \frac{k!}{k^{k}}, \]

where $\gamma \leq 1/2$. Then $G$ can be made into a $k$-partite graph $G_0$ by removing at most

\[ \left( 2\beta + 2\gamma + \frac{8k^{k+1}(k+1)}{k!} \sqrt{\alpha} + 2k/n \right) \binom{n}{2} \]

edges, which removes at most

\[ \left( 2\beta + 2\gamma + \frac{8k^{k+1}(k+1)}{k!} \sqrt{\alpha} + 2k/n \right) \binom{n}{2} \binom{n-2}{k-2} \]

$K_k$'s.

Proof. If $k \in \mathbb{N}$, and $G$ is a graph, let

\[ \mathcal{K}_k(G) = \{S \in V(G)^{(k)} : G[S] \text{ is a clique}\} \]

denote the set of all $k$-sets that induce a clique in $G$. If $S \subset V(G)$, let $N(S)$ denote the set of vertices of $G$ joined to all vertices in $S$, i.e. the intersection of the neighbourhoods of the vertices in $S$, and let $d(S) = |N(S)|$. For $S \in \mathcal{K}_k(G)$, let

\[ f_G(S) = \sum_{T \subset S,\ |T| = k-1} d(T). \]

We begin by sketching the proof. The fact that the ratio between the $K_k$-density of $G$ and the $K_{k-1}$-density of $G$ is very close to $1/k$ will imply that the average $\mathbb{E} f_G(S)$ over all sets $S \in \mathcal{K}_k(G)$ is not too far below $n$. The fact that the $K_{k+1}$-density of $G$ is small will mean that for most sets $S \in \mathcal{K}_k(G)$, every $(k-1)$-subset $T \subset S$ has $N(T)$ spanning few edges of $G$, and any two distinct $(k-1)$-subsets $T, T' \subset S$ have $|N(T) \cap N(T')|$ small. Hence, if we pick such a set $S$ which has $f_G(S)$ not too far below the average, the sets $\{N(T) : T \subset S,\ |T| = k-1\}$ will be almost pairwise disjoint, will cover most of the vertices of $G$, and will each span few edges of $G$. Small alterations will produce a $k$-partition of $V(G)$ with few edges of $G$ within each class, proving the theorem.

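We now proceed with the proof. Observe that

\[
\begin{aligned}
\mathbb{E} f_G &= \frac{\sum_{S \in \mathcal{K}_k(G)} \sum_{T \subset S,\ |T| = k-1} d(T)}{K_k(G)} \\
&= \frac{\sum_{T \in \mathcal{K}_{k-1}(G)} d(T)^{2}}{K_k(G)} \\
&\geq \frac{\left( \sum_{T \in \mathcal{K}_{k-1}(G)} d(T) \right)^{2}}{K_{k-1}(G) K_k(G)} \\
&= \frac{(k K_k(G))^{2}}{K_{k-1}(G) K_k(G)} \\
&= k^{2} \frac{K_k(G)}{K_{k-1}(G)} \\
&\geq k^{2} \frac{(1 - \gamma) \frac{k!}{k^{k}} \binom{n}{k}}{(1 + \beta) \frac{k!}{k^{k-1}} \binom{n}{k-1}} \\
&= \frac{1 - \gamma}{1 + \beta} (n - k + 1).
\end{aligned}
\]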

(The first inequality follows from Cauchy-Schwarz, and the second from our assumptions on the $K_k$-density and the $K_{k-1}$-density of $G$.)

We call a set $T \in \mathcal{K}_{k-1}(G)$ dangerous if it is contained in at least $\sqrt{\alpha} \binom{n-k+1}{2}$ $K_{k+1}$'s. Let $D$ denote the number of dangerous $(k-1)$-sets. Double-counting the number of times a $(k-1)$-set is contained in a $K_{k+1}$, we obtain:

\[ D \sqrt{\alpha} \binom{n-k+1}{2} \leq \binom{k+1}{2} \alpha \binom{n}{k+1}, \]

since there are at most $\alpha \binom{n}{k+1}$ $K_{k+1}$'s in $G$. Hence,

\[ D \leq \sqrt{\alpha} \binom{n}{k-1}. \]

Similarly, we call a set $S \in \mathcal{K}_k(G)$ treacherous if it is contained in at least $\sqrt{\alpha}(n - k)$ $K_{k+1}$'s. Double-counting the number of times a $k$-set is contained in a $K_{k+1}$, we see that there are at most $\sqrt{\alpha} \binom{n}{k}$ treacherous $k$-sets.

Call a set $S \in \mathcal{K}_k(G)$ bad if it is treacherous, or contains at least one dangerous $(k-1)$-set; otherwise, call $S$ good. Then the number of bad $k$-sets is at most

\[ \sqrt{\alpha} \binom{n}{k} + (n - k + 1) \sqrt{\alpha} \binom{n}{k-1} = (k + 1) \sqrt{\alpha} \binom{n}{k}, \]

so the fraction of sets in $\mathcal{K}_k(G)$ which are bad is at most

\[ \frac{(k + 1) \sqrt{\alpha}}{(1 - \gamma) \frac{k!}{k^{k}}} = \frac{k^{k} (k + 1) \sqrt{\alpha}}{(1 - \gamma) k!}. \]

Suppose that

\[ \max\{f_G(S) : S \text{ is good}\} < (1 - \psi)(n - k + 1). \]

Observe that for any $S \in \mathcal{K}_k(G)$, we have

\[ f_G(S) \leq k(n - k + 1), \]

since $d(T) \leq n - k + 1$ for each $T \in S^{(k-1)}$. Hence,

\[ \mathbb{E} f_G < (1 - \psi)(n - k + 1) + \frac{k^{k} (k + 1) \sqrt{\alpha}}{(1 - \gamma) k!} \, k(n - k + 1), \]

a contradiction if

\[ \psi \geq 1 - \frac{1 - \gamma}{1 + \beta} + \frac{k^{k+1} (k + 1) \sqrt{\alpha}}{(1 - \gamma) k!}, \]

which holds (since $\gamma \leq 1/2$) for

\[ \psi := \beta + \gamma + \frac{2k^{k+1}(k + 1)}{k!} \sqrt{\alpha}. \]

Let $S \in \mathcal{K}_k(G)$ be a good $k$-set such that $f_G(S) \geq (1 - \psi)(n - k + 1)$. Write $S = \{v_1, \ldots, v_k\}$, let $T_i = S \setminus \{v_i\}$ for each $i$, and let $N_i = N(T_i)$ for each $i$. Observe that $N_i \cap N_j = N(S)$ for each $i \neq j$, and $|N(S)| = d(S) \leq \sqrt{\alpha}(n - k)$. Let $W_i = N_i \setminus N(S)$ for each $i$; observe that the $W_i$'s are pairwise disjoint. Let

\[ R = V(G) \setminus \bigcup_{i=1}^{k} W_i \]

be the set of 'leftover' vertices. Observe that

\[ \sum_{i=1}^{k} |N_i \setminus N(S)| = f_G(S) - k|N(S)| \geq (1 - \psi)(n - k + 1) - k\sqrt{\alpha}(n - k), \]

and therefore the number of leftover vertices satisfies

\[ |R| < (\psi + k\sqrt{\alpha})n + k. \]

We now produce a $k$-partition $(V_i)_{i=1}^{k}$ of $V(G)$ by extending the partition $(W_i)_{i=1}^{k}$ of $V(G) \setminus R$ arbitrarily to $R$, i.e., we partition the leftover vertices arbitrarily. Now delete all edges of $G$ within $V_i$ for each $i$. The number of edges within $N_i$ is precisely the number of $K_{k+1}$'s containing $T_i$, which is at most $\sqrt{\alpha} \binom{n-k+1}{2}$. The number of edges incident with $R$ is trivially at most $(\psi + k\sqrt{\alpha})n(n - 1) + k(n - 1)$. Hence, the number of edges deleted was at most

\[ (\psi + k\sqrt{\alpha})n(n - 1) + k(n - 1) + k\sqrt{\alpha} \binom{n-k+1}{2} \leq \left( 2\beta + 2\gamma + \frac{8k^{k+1}(k+1)}{k!} \sqrt{\alpha} + 2k/n \right) \binom{n}{2}. \]

Removing an edge removes at most $\binom{n-2}{k-2}$ $K_k$'s, and therefore the total number of $K_k$'s removed is at most

\[ \left( 2\beta + 2\gamma + \frac{8k^{k+1}(k+1)}{k!} \sqrt{\alpha} + 2k/n \right) \binom{n}{2} \binom{n-2}{k-2}, \]

completing the proof. □
Note that the two results above together imply the following

Corollary 15. For any $k \in \mathbb{N}$, there exist constants $A_k, B_k > 0$ such that the following holds. For any $\alpha \geq 0$, if $G$ is an $n$-vertex graph with $K_{k+1}$-density at most $\alpha$, and $K_k$-density at least

\[ (1 - \gamma) \frac{k!}{k^{k}}, \]

where $\gamma \leq 1/2$, then $G$ can be made into a $k$-partite graph $G_0$ by removing at most

\[ \left( 2\gamma + A_k \alpha^{1/(k+2)} + B_k/n \right) \binom{n}{2} \]

edges, which removes at most

\[ \left( 2\gamma + A_k \alpha^{1/(k+2)} + B_k/n \right) \binom{k}{2} \binom{n}{k} \]

$K_k$'s.

Proof of Proposition 9. Suppose $\mathcal{G} \subset \mathcal{P}X$ has $|\mathcal{G}| = m \leq (1 + \eta)|\mathcal{F}_{n,k}|$, and $k$-generates at least $(1 - \epsilon)2^{n}$ subsets of $X$. Our aim is to show that $\mathcal{G}$ is close to a canonical $k$-generator. We may assume that $\epsilon \leq 1/C_k^{k}$ and $\eta \leq 1/D_k^{k}$, so by choosing $C_k$ and $D_k$ appropriately large, we may assume throughout that $\epsilon$ and $\eta$ are small. By choosing $\xi_k$ appropriately small, we may assume that $n \geq n_0(k)$, where $n_0(k)$ is any function of $k$.

We first apply Lemma 13 and Theorem 14 with $G = H[\mathcal{G}]$, where $H$ is the Kneser graph on $\mathcal{P}X$, $\mathcal{G} \subset \mathcal{P}X$ with $|\mathcal{G}| = m \leq (1 + \eta)|\mathcal{F}_{n,k}|$, and $\mathcal{G}$ $k$-generates at least $(1 - \epsilon)2^{n}$ subsets of $X$. By (6), we have

\[ d_{K_{k+1}}(H[\mathcal{G}]) \leq 2^{-a_k n}, \]

and therefore we may take $\alpha = 2^{-a_k n}$. Applying Lemma 13 with $r = k - 1$, we may take $\beta = 2^{-b_k n}$ for some $b_k > 0$.

We have $|\mathcal{G}| = m \leq (1 + \eta)(k 2^{n/k} - k)$, so

\[ \binom{m}{k} \leq \frac{m^{k}}{k!} \leq \frac{(1 + \eta)^{k} k^{k} 2^{n}}{k!}. \]

Notice that

\[ \sum_{i=0}^{k-1} \binom{m}{i} \leq k m^{k-1} \leq k \left( (1 + \eta) k 2^{n/k} \right)^{k-1} \leq (1 + \eta)^{k-1} k^{k} 2^{(1 - 1/k)n}. \]

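Since $\mathcal{G}$ $k$-generates at least $(1 - \epsilon)2^{n}$ subsets of $X$, we have

\[ K_k(H[\mathcal{G}]) \geq (1 - \epsilon)2^{n} - (1 + \eta)^{k-1} k^{k} 2^{(1 - 1/k)n}. \]

Hence,

\[
\begin{aligned}
d_{K_k}(H[\mathcal{G}]) &= \frac{K_k(H[\mathcal{G}])}{\binom{m}{k}} \\
&\geq \frac{(1 - \epsilon)2^{n} - (1 + \eta)^{k-1} k^{k} 2^{(1 - 1/k)n}}{(1 + \eta)^{k} k^{k} 2^{n}/k!} \\
&= \frac{1 - \epsilon - (1 + \eta)^{k-1} k^{k} 2^{-n/k}}{(1 + \eta)^{k}} \cdot \frac{k!}{k^{k}} \\
&\geq \left( 1 - \epsilon - k\eta - k^{k} 2^{-n/k} \right) \frac{k!}{k^{k}},
\end{aligned}
\]

where the last inequality follows from

\[ \frac{1 - \epsilon}{(1 + \eta)^{k}} \geq (1 - \epsilon)(1 - \eta)^{k} \geq (1 - \epsilon)(1 - k\eta) \geq 1 - \epsilon - k\eta. \]

Therefore, the $K_k$-density of $H[\mathcal{G}]$ satisfies

\[ d_{K_k}(H[\mathcal{G}]) \geq (1 - \gamma) \frac{k!}{k^{k}}, \]

where

\[ \gamma = \epsilon + k\eta + k^{k} 2^{-n/k}. \]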

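Let

\[ \psi := 2\beta + 2\gamma + \frac{8k^{k+1}(k+1)}{k!} \sqrt{\alpha} + 2k/m. \]

By Theorem 14, there exists a $k$-partite subgraph $G_0$ of $H[\mathcal{G}]$ with

\[ K_k(G_0) \geq K_k(H[\mathcal{G}]) - \psi \binom{k}{2} \binom{m}{k}, \]

using $\binom{m}{2} \binom{m-2}{k-2} = \binom{k}{2} \binom{m}{k}$. Writing

\[ \phi := \epsilon + (1 + \eta)^{k} \binom{k}{2} \frac{k^{k}}{k!} \psi + (1 + \eta)^{k-1} k^{k} 2^{-n/k}, \]

we have

\[
\begin{aligned}
K_k(G_0) &\geq (1 - \epsilon)2^{n} - (1 + \eta)^{k-1} k^{k} 2^{(1 - 1/k)n} - \psi \binom{k}{2} \binom{m}{k} \\
&\geq \left( 1 - \epsilon - (1 + \eta)^{k} \binom{k}{2} \frac{k^{k}}{k!} \psi - (1 + \eta)^{k-1} k^{k} 2^{-n/k} \right) 2^{n} \\
&= (1 - \phi) 2^{n}.
\end{aligned}
\]

Let $V_1, \ldots, V_k$ be the vertex-classes of $G_0$. By the AM/GM inequality,

\[ K_k(G_0) \leq \prod_{i=1}^{k} |V_i| \leq \left( \frac{1}{k} \sum_{i=1}^{k} |V_i| \right)^{k} = (m/k)^{k}, \]

and therefore

\[ |\mathcal{G}| = m \geq k (K_k(G_0))^{1/k} \geq k (1 - \phi)^{1/k} 2^{n/k}, \qquad (8) \]

recovering the asymptotic result of [5].

Moreover, any $k$-partite graph $G_0$ satisfies

\[ e(G_0) \geq \binom{k}{2} (K_k(G_0))^{2/k}. \]

To see this, simply apply Shearer's Entropy Lemma with $S = V(G_0)$, $\mathcal{F} = \mathcal{K}_k(G_0)$, and $\mathcal{A} = \{V_i \cup V_j : i \neq j\}$. Then $\mathcal{A}$ is a $(k-1)$-cover of $V(G_0)$. Note that $\mathcal{F}_{V_i \cup V_j} \subset E_{G_0}(V_i, V_j)$, and therefore

\[ (K_k(G_0))^{k-1} \leq \prod_{\{i,j\} \in [k]^{(2)}} e_{G_0}(V_i, V_j). \]

Applying the AM/GM inequality gives:

\[ (K_k(G_0))^{k-1} \leq \left( \frac{\sum_{\{i,j\} \in [k]^{(2)}} e_{G_0}(V_i, V_j)}{\binom{k}{2}} \right)^{\binom{k}{2}} = \left( \frac{e(G_0)}{\binom{k}{2}} \right)^{\binom{k}{2}}, \]

and therefore

\[ e(G_0) \geq \binom{k}{2} (K_k(G_0))^{2/k}, \]

as required.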

It follows that

\[
\begin{aligned}
e(G_0) &\geq \binom{k}{2} (1 - \phi)^{2/k} 2^{2n/k} \\
&\geq \binom{k}{2} (1 - \phi)^{2/k} \left( \frac{m}{(1 + \eta)k} \right)^{2} \\
&\geq (1 - \eta)^{2} (1 - \phi)^{2/k} (1 - 1/k) m^{2}/2 \\
&\geq (1 - 2\eta - \phi^{2/k})(1 - 1/k) m^{2}/2 \\
&= (1 - \delta)(1 - 1/k) m^{2}/2,
\end{aligned}
\]

where $\delta = 2\eta + \phi^{2/k}$.

Hence, $G_0$ is a $k$-partite subgraph of $H[\mathcal{G}]$ with $|G_0| = |\mathcal{G}| = m$, and $e(G_0) \geq (1 - \delta - 1/k) m^{2}/2$. Applying Lemma 11 to $G_0$, we see that there exists an induced subgraph $H'$ of $G_0$ with

\[ |H'| \geq (1 - \sqrt{\delta}) |\mathcal{G}|, \qquad (9) \]

and

\[ \delta(H') \geq (1 - 1/k - \sqrt{\delta})(|H'| - 1). \]

Let $Y_1, \ldots, Y_k$ be the vertex-classes of $H'$; note that these are families of subsets of $X$. Clearly, for each $i \in [k]$,

\[ |Y_i| \leq |H'| - \delta(H') \leq (1/k + \sqrt{\delta})|H'| + 1. \qquad (10) \]

Hence, for each $i \in [k]$,

\[ |Y_i| \geq |H'| - (k - 1)\left( (1/k + \sqrt{\delta})|H'| + 1 \right) \geq (1/k - (k - 1)\sqrt{\delta})|H'| - k + 1. \qquad (11) \]

For each $i \in [k]$, let

\[ S_i = \bigcup_{y \in Y_i} y \]

be the union of all sets in $Y_i$. We claim that the $S_i$'s are pairwise disjoint. Suppose for a contradiction that $S_1 \cap S_2 \neq \emptyset$. Then there exist $y_1 \in Y_1$ and $y_2 \in Y_2$ which both contain some element $p \in X$. Since

\[ \delta(H') \geq (1 - 1/k - \sqrt{\delta})(|H'| - 1), \]

at least $(1 - 1/k - \sqrt{\delta})(|H'| - 1)$ sets in $\bigcup_{i \neq 1} Y_i$ do not contain $p$. By (10),

\[ \left| \bigcup_{i \neq 1} Y_i \right| = \sum_{i \neq 1} |Y_i| \leq (1 - 1/k + (k - 1)\sqrt{\delta})|H'| + k - 1, \]

and therefore the number of sets in $\bigcup_{i \neq 1} Y_i$ containing $p$ is at most

\[ (1 - 1/k + (k - 1)\sqrt{\delta})|H'| + k - 1 - (1 - 1/k - \sqrt{\delta})(|H'| - 1) \leq k\sqrt{\delta}|H'| + k. \]

The same holds for the number of sets in $\bigcup_{i \neq 2} Y_i$ containing $p$, so the total number of sets in $H'$ containing $p$ is at most

\[ 2k\sqrt{\delta}|H'| + 2k. \]

Hence, the total number of sets in $\mathcal{G}$ containing $p$ is at most

\[ (2k + 1)\sqrt{\delta} m + 2k. \]

But then the number of ways of choosing at most $k$ disjoint sets in $\mathcal{G}$ with one containing $p$ is at most

\[ (1 + m^{k-1})\left( (2k + 1)\sqrt{\delta} m + 2k \right) = O_k(\sqrt{\delta}) 2^{n} + O_k(2^{(1 - 1/k)n}) < 2^{n-1} - \epsilon 2^{n}, \]

contradicting the fact that $\mathcal{G}$ $k$-generates all but at most $\epsilon 2^{n}$ of the $2^{n-1}$ sets containing $p$.

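Hence, we may conclude that the $S_i$'s are pairwise disjoint. By definition, $Y_i \subset \mathcal{P}S_i$, and therefore $|Y_i| \leq 2^{|S_i|}$. But from (11),

\[
\begin{aligned}
|Y_i| &\geq (1 - k(k - 1)\sqrt{\delta})|H'|/k - k + 1 \\
&\geq (1 - k(k - 1)\sqrt{\delta})(1 - \sqrt{\delta})|\mathcal{G}|/k - k + 1 \\
&\geq (1 - k(k - 1)\sqrt{\delta})(1 - \sqrt{\delta})(1 - \phi)^{1/k} 2^{n/k} - k + 1 \\
&\geq (1 - (k(k - 1) + 1)\sqrt{\delta} - \phi^{1/k}) 2^{n/k} - k + 1 \\
&\geq (1 - k^{2}\sqrt{\delta} - \phi^{1/k}) 2^{n/k} - k \\
&> 2^{n/k - 1},
\end{aligned}
\]

using (9) and the bound $|\mathcal{G}| \geq k(1 - \phi)^{1/k} 2^{n/k}$ obtained above for the second and third inequalities respectively. Hence, we must have $|S_i| \geq n/k$ for each $i$, and therefore $|S_i| = n/k$ for each $i$, i.e. $(S_i)_{i=1}^{k}$ is an equipartition of $X$. Putting everything together and recalling that $\delta = 2\eta + \phi^{2/k}$ and $\phi = O_k(\epsilon + \eta + 2^{-c_k n})$, we have

\[
\begin{aligned}
\left| \mathcal{G} \cap \left( \bigcup_{i=1}^{k} \mathcal{P}S_i \right) \right| &\geq \sum_{i=1}^{k} |Y_i| \\
&\geq (1 - k^{2}\sqrt{\delta} - \phi^{1/k}) k 2^{n/k} - k^{2} \\
&\geq (1 - C_k \epsilon^{1/k} - D_k \eta^{1/k} - 2^{-\xi_k n}) k 2^{n/k}
\end{aligned}
\]

(provided $n$ is sufficiently large depending on $k$), where $C_k, D_k, \xi_k > 0$ depend only on $k$. This proves Proposition 9. □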

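We now prove the following

Proposition 16. Let $\nu(n) = o(1)$. If $\mathcal{G}$ is a $k$-generator for $X$ with $|\mathcal{G}| \leq |\mathcal{F}_{n,k}|$, and

\[ \left| \mathcal{G} \cap \left( \bigcup_{i=1}^{k} \mathcal{P}S_i \right) \right| \geq (1 - \nu)|\mathcal{F}_{n,k}|, \]

where $(S_i)_{i=1}^{k}$ is a partition of $X$ into $k$ classes of sizes as equal as possible, then provided $n$ is sufficiently large depending on $k$, we have $|\mathcal{G}| = |\mathcal{F}_{n,k}|$ and

\[ \mathcal{G} = \bigcup_{i=1}^{k} \mathcal{P}S_i \setminus \{\emptyset\}. \]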

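Note that $n$ is no longer assumed to be a multiple of $k$; the case $k = 2$ and $n$ odd will be needed in Section 3.

Proof. Let $\mathcal{G}$ and $(S_i)_{i=1}^{k}$ be as in the statement of the proposition. For each $i \in [k]$, let $\mathcal{F}_i = (\mathcal{P}S_i \setminus \{\emptyset\}) \setminus \mathcal{G}$ be the collection of all nonempty subsets of $S_i$ which are not in $\mathcal{G}$. By our assumption on $\mathcal{G}$, we know that $|\mathcal{F}_i| \leq o(2^{|S_i|})$ for each $i \in [k]$. Let

\[ \mathcal{E} = \mathcal{G} \setminus \bigcup_{i=1}^{k} \mathcal{P}(S_i) \]

be the collection of 'extra' sets in $\mathcal{G}$; let $|\mathcal{E}| = M$.

By relabeling the $S_i$'s, we may assume that $|\mathcal{F}_1| \geq |\mathcal{F}_2| \geq \cdots \geq |\mathcal{F}_k|$. By our assumption on $|\mathcal{G}|$, $M \leq \sum_{i=1}^{k} |\mathcal{F}_i| \leq k|\mathcal{F}_1|$.

Let

\[ \mathcal{R} = \{y_1 \sqcup s_2 \sqcup \cdots \sqcup s_k : y_1 \in \mathcal{F}_1,\ s_i \subset S_i\ \forall i \geq 2\}; \]

observe that the sets $y_1 \sqcup s_2 \sqcup \cdots \sqcup s_k$ are all distinct, so $|\mathcal{R}| = |\mathcal{F}_1| 2^{n - |S_1|}$. By considering the number of sets in $\mathcal{E}$ needed for $\mathcal{G}$ to $k$-generate $\mathcal{R}$, we will show that $M > k|\mathcal{F}_1|$ unless $\mathcal{F}_1 = \emptyset$. (In fact, our argument would also show that $M > p_k|\mathcal{F}_1|$ unless $\mathcal{F}_1 = \emptyset$, for any $p_k > 0$ depending only on $k$.)

Let $N$ be the number of sets in $\mathcal{R}$ which may be expressed as a disjoint union of two sets in $\mathcal{E}$ and at most $k - 2$ other sets in $\mathcal{G}$. Then

\[
\begin{aligned}
N &\leq \binom{M}{2} \sum_{i=0}^{k-2} \binom{m}{i} \\
&\leq c_0^{k-2} k^{k} |\mathcal{F}_1|^{2} 2^{(1 - 2/k)n} \\
&\leq 4 c_0^{k-2} k^{k} |\mathcal{F}_1|^{2} 2^{n - 2|S_1|} \\
&= o(1) |\mathcal{F}_1| 2^{n - |S_1|} \\
&= o(|\mathcal{R}|), \qquad (12)
\end{aligned}
\]

where $m = |\mathcal{G}|$, and we have used $|\mathcal{G}| \leq |\mathcal{F}_{n,k}| \leq c_0 k 2^{n/k}$ (see (3)), $|S_1| \leq \lceil n/k \rceil$, and $|\mathcal{F}_1| = o(2^{|S_1|})$ in the second, third and fourth lines respectively.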
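Now fix $x_1 \in \mathcal{F}_1$. For $j \geq 1$, let $\mathcal{A}_j(x_1)$ be the collection of $(k-1)$-tuples $(s_2, \ldots, s_k) \in \mathcal{P}S_2 \times \cdots \times \mathcal{P}S_k$ such that

\[ x_1 \sqcup s_2 \sqcup \cdots \sqcup s_k \]

may be expressed as a disjoint union

\[ y_1 \sqcup y_2 \sqcup \cdots \sqcup y_k \]

with $y_j \in \mathcal{E}$ but $y_i \subset S_i\ \forall i \neq j$. Let $\mathcal{A}^{*}(x_1)$ be the collection of $(k-1)$-tuples $(s_2, \ldots, s_k) \in \mathcal{P}S_2 \times \cdots \times \mathcal{P}S_k$ such that

\[ x_1 \sqcup s_2 \sqcup \cdots \sqcup s_k \]

may be expressed as a disjoint union of two sets in $\mathcal{E}$ and at most $k - 2$ other sets in $\mathcal{G}$.

Now fix $j \neq 1$. For each $(s_2, \ldots, s_k) \in \mathcal{A}_j(x_1)$, we may write

\[ x_1 \sqcup s_2 \sqcup \cdots \sqcup s_k = s_1' \sqcup s_2 \sqcup \cdots \sqcup s_{j-1} \sqcup y_j \sqcup s_{j+1} \sqcup \cdots \sqcup s_k, \]

where $y_j = s_j \sqcup (x_1 \setminus s_1') \in \mathcal{E}$. Since $y_j \cap S_j = s_j$, different $s_j$'s correspond to different $y_j$'s $\in \mathcal{E}$, and so there are at most $|\mathcal{E}| = M$ choices for $s_j$. Therefore,

\[ |\mathcal{A}_j(x_1)| \leq 2^{n - |S_1| - |S_j|} M \leq 2^{n - |S_1| - |S_j|} k|\mathcal{F}_1| \leq 2k \left( \frac{|\mathcal{F}_1|}{2^{|S_1|}} \right) 2^{n - |S_1|}, \]

the last inequality following from the fact that $|S_j| \geq |S_1| - 1$. Hence,

\[ \sum_{j=2}^{k} |\mathcal{A}_j(x_1)| \leq 2k(k - 1) \left( \frac{|\mathcal{F}_1|}{2^{|S_1|}} \right) 2^{n - |S_1|} = o(1) 2^{n - |S_1|}. \qquad (13) \]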

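Observe that for each $x_1 \in \mathcal{F}_1$,

\[ \mathcal{A}^{*}(x_1) \cup \bigcup_{j=1}^{k} \mathcal{A}_j(x_1) = \mathcal{P}S_2 \times \mathcal{P}S_3 \times \cdots \times \mathcal{P}S_k, \]

and therefore

\[ |\mathcal{A}^{*}(x_1)| + |\mathcal{A}_1(x_1)| + \sum_{j=2}^{k} |\mathcal{A}_j(x_1)| \geq 2^{n - |S_1|}, \]

so by (13),

\[ |\mathcal{A}^{*}(x_1)| + |\mathcal{A}_1(x_1)| \geq (1 - o(1)) 2^{n - |S_1|}. \]

Call $x_1 \in \mathcal{F}_1$ 'bad' if $|\mathcal{A}^{*}(x_1)| \geq 2^{-(k+2)} 2^{n - |S_1|}$; otherwise, call $x_1$ 'good'. By (12), at most an $o(1)$-fraction of the sets in $\mathcal{F}_1$ are bad, so at least a $1 - o(1)$ fraction are good. For each good set $x_1 \in \mathcal{F}_1$, notice that

\[ |\mathcal{A}_1(x_1)| \geq (1 - 2^{-(k+2)} - o(1)) 2^{n - |S_1|}. \]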

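Now perform the following process. Choose any $(s_2, \ldots, s_k) \in \mathcal{A}_1(x_1)$; we may write

\[ x_1 \sqcup s_2 \sqcup \cdots \sqcup s_k = z^{(1)} \sqcup s_2' \sqcup \cdots \sqcup s_k' \]

with $(s_2', \ldots, s_k') \in \mathcal{P}S_2 \times \cdots \times \mathcal{P}S_k$, $z^{(1)} \in \mathcal{E}$, $z^{(1)} \cap S_1 = x_1$, and $z^{(1)} \setminus S_1 \neq \emptyset$. Pick $p_1 \in z^{(1)} \setminus S_1$. At most $\frac{1}{2} 2^{n - |S_1|}$ of the members of $\mathcal{A}_1(x_1)$ have union containing $p_1$, so there are at least

\[ \left( 1 - \tfrac{1}{2} - 2^{-(k+2)} - o(1) \right) 2^{n - |S_1|} \]

remaining members of $\mathcal{A}_1(x_1)$. Choose one of these, $(t_2, \ldots, t_k)$ say. By definition, we may write

\[ x_1 \sqcup t_2 \sqcup \cdots \sqcup t_k = z^{(2)} \sqcup t_2' \sqcup \cdots \sqcup t_k' \]

with $(t_2', \ldots, t_k') \in \mathcal{P}S_2 \times \cdots \times \mathcal{P}S_k$, $z^{(2)} \in \mathcal{E}$, $z^{(2)} \cap S_1 = x_1$, and $z^{(2)} \setminus S_1 \neq \emptyset$. Since $p_1 \notin z^{(2)}$, we must have $z^{(2)} \neq z^{(1)}$. Pick $p_2 \in z^{(2)} \setminus S_1$, and repeat. At most $\frac{3}{4} 2^{n - |S_1|}$ of the members of $\mathcal{A}_1(x_1)$ have union containing $p_1$ or $p_2$; there are at least

\[ \left( 1 - \tfrac{3}{4} - 2^{-(k+2)} - o(1) \right) 2^{n - |S_1|} \]

members remaining. Choose one of these, $(u_2, \ldots, u_k)$ say. By definition, we may write

\[ x_1 \sqcup u_2 \sqcup \cdots \sqcup u_k = z^{(3)} \sqcup u_2' \sqcup \cdots \sqcup u_k' \]

with $(u_2', \ldots, u_k') \in \mathcal{P}S_2 \times \cdots \times \mathcal{P}S_k$, $z^{(3)} \in \mathcal{E}$, $z^{(3)} \cap S_1 = x_1$, and $z^{(3)} \setminus S_1 \neq \emptyset$. Note that again $z^{(3)}$ is distinct from $z^{(1)}, z^{(2)}$, since $p_1, p_2 \notin z^{(3)}$. Continuing this process for $k + 1$ steps, we end up with a collection of $k + 1$ distinct sets $z^{(1)}, \ldots, z^{(k+1)} \in \mathcal{E}$ such that $z^{(l)} \cap S_1 = x_1\ \forall l \in [k+1]$. Do this for each good set $x_1 \in \mathcal{F}_1$; the collections produced are clearly pairwise disjoint. Therefore,

\[ |\mathcal{E}| \geq (k + 1)(1 - o(1))|\mathcal{F}_1|. \]

This is a contradiction, unless $\mathcal{F}_1 = \emptyset$. Hence, since $|\mathcal{F}_1| \geq |\mathcal{F}_i|$ for each $i$, we must have $\mathcal{F}_2 = \cdots = \mathcal{F}_k = \emptyset$ as well, and therefore

\[ \mathcal{G} = \bigcup_{i=1}^{k} \mathcal{P}(S_i) \setminus \{\emptyset\}, \]

proving Proposition 16, and completing the proof of Theorem 5. □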



3 The case $k = 2$ via bipartite subgraphs of $H$.

Our aim in this section is to prove the $k = 2$ case of Conjecture 1 for all sufficiently large odd $n$, which together with the $k = 2$ case of Theorem 5 will imply

Theorem 4. If $n$ is sufficiently large, $X$ is an $n$-element set, and $\mathcal{G} \subset \mathcal{P}X$ is a 2-generator for $X$, then $|\mathcal{G}| \geq |\mathcal{F}_{n,2}|$. Equality holds only if $\mathcal{G}$ is of the form $\mathcal{F}_{n,2}$.

Recall that 

I f 2 • 2 n / 2 - 2 if n is even; 
|>«,2| - | 3 . 2 (n-i)/2_ 2 if n is odd. 
Suppose that $X$ is an $n$-element set, and $\mathcal{G} \subset \mathcal{P}X$ is a 2-generator for $X$ with $|\mathcal{G}| = m \leq |\mathcal{F}_{n,2}|$. The counting argument in the Introduction gives

\[ 1 + m + \binom{m}{2} \geq 2^n, \]

which implies that

\[ |\mathcal{G}| \geq (1 - o(1))\sqrt{2}\, 2^{n/2}. \]

For $n$ odd, we wish to improve this bound by a factor of approximately 1.5.
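
To make the factor of 1.5 concrete, here is a small numerical illustration in Python (a sketch for orientation only; the helper name `counting_lower_bound` is hypothetical). It computes the smallest $m$ allowed by the counting inequality and compares it with $\sqrt{2}\, 2^{n/2}$ and with the canonical size $3 \cdot 2^{(n-1)/2} - 2$ for an odd $n$.

```python
import math


def counting_lower_bound(n):
    """Smallest m with 1 + m + C(m, 2) >= 2**n (the counting bound for 2-generators)."""
    m = 0
    while 1 + m + m * (m - 1) // 2 < 2 ** n:
        m += 1
    return m


n = 21  # an odd value, small enough for a direct search
m = counting_lower_bound(n)
canonical = 3 * 2 ** ((n - 1) // 2) - 2
ratio_to_sqrt2_bound = m / (math.sqrt(2) * 2 ** (n / 2))
ratio_canonical_to_counting = canonical / m
```

For odd $n$ the second ratio tends to $3/(\sqrt{2} \cdot \sqrt{2}) = 3/2$, which is the gap the rest of the section closes.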

Our first aim is to prove that induced subgraphs of the Kneser graph $H$ which have order $\Omega(2^{n/2})$ are $o(1)$-close to being bipartite (Proposition 18).

Recall that a graph $G = (V, E)$ is said to be $\epsilon$-close to being bipartite if it can be made bipartite by the removal of at most $\epsilon|V|^2$ edges, and $\epsilon$-far from being bipartite if it requires the removal of at least $\epsilon|V|^2$ edges to make it bipartite.
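
For very small graphs the definition can be tested directly. The sketch below (an illustration under the definition just given, with hypothetical function names, and exponential in the number of vertices) computes the minimum number of edges whose removal makes a graph bipartite by trying every 2-colouring and counting monochromatic edges.

```python
from itertools import product


def min_edges_to_bipartite(vertices, edges):
    """Minimum number of edges to delete so that the graph becomes bipartite (brute force)."""
    vertices = list(vertices)
    best = len(edges)
    for colours in product((0, 1), repeat=len(vertices)):
        side = dict(zip(vertices, colours))
        # Edges inside a colour class are exactly the ones that must be removed.
        monochromatic = sum(1 for u, v in edges if side[u] == side[v])
        best = min(best, monochromatic)
    return best


def is_eps_close_to_bipartite(vertices, edges, eps):
    """True if at most eps * |V|^2 edge deletions suffice."""
    vertices = list(vertices)
    return min_edges_to_bipartite(vertices, edges) <= eps * len(vertices) ** 2
```

For example, a triangle or a 5-cycle needs one deletion, while the complete graph on four vertices needs two.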

Using Szemerédi's Regularity Lemma, Bollobás, Erdős, Simonovits and Szemerédi [5] proved the following.

Theorem 17 (Bollobás, Erdős, Simonovits, Szemerédi). For any $\epsilon > 0$, there exists $g(\epsilon) \in \mathbb{N}$ depending on $\epsilon$ alone such that for any graph $G$ which is $\epsilon$-far from being bipartite, the probability that a uniform random induced subgraph of $G$ of order $g(\epsilon)$ is non-bipartite is at least 1/2.

Building on methods of Goldreich, Goldwasser and Ron [12], Alon and Krivelevich [2] proved without using the Regularity Lemma that in fact, one may take

\[ g(\epsilon) \leq \frac{(\log(1/\epsilon))^b}{\epsilon} \tag{14} \]

where $b > 0$ is an absolute constant. As observed in [2], this is tight up to the poly-logarithmic factor, since necessarily,

\[ g(\epsilon) = \Omega(1/\epsilon). \]

We will first show that for any fixed $c > 0$ and $l \in \mathbb{N}$, if $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| \geq c2^{n/2}$, then the density of $C_{2l+1}$'s in $H[\mathcal{A}]$ is at most $o(1)$. To prove this, we will show that for any $l \in \mathbb{N}$, there exists $t \in \mathbb{N}$ such that for any fixed $c > 0$, if $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| \geq c2^{n/2}$, then the homomorphism density of $C_{2l+1} \otimes t$ in $H[\mathcal{A}]$ is $o(1)$. Using the lemma relating the homomorphism density of a graph to that of its blow-up, we will deduce that the homomorphism density of $C_{2l+1}$ in $H[\mathcal{A}]$ is $o(1)$, implying that the density of $C_{2l+1}$'s in $H[\mathcal{A}]$ is $o(1)$. This will show that $H[\mathcal{A}]$ is $o(1)$-close to being bipartite (Proposition 18). To obtain a sharper estimate for the $o(1)$ term in Proposition 18, we will use (14), although to prove Theorem 4 any $o(1)$ term would suffice, so one could in fact use Theorem 17 instead of (14).
We are now ready to prove the following

Proposition 18. Let $c > 0$. Then there exists $b > 0$ such that for any $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| \geq c2^{n/2}$, the induced subgraph $H[\mathcal{A}]$ can be made bipartite by removing at most

\[ \frac{(\log_2 \log_2 n)^b}{\log_2 n} |\mathcal{A}|^2 \]

edges.

Proof. Fix $c > 0$; let $\mathcal{A} \subset \mathcal{P}X$ with $|\mathcal{A}| = m \geq c2^{n/2}$. First, we show that for any fixed $l \in \mathbb{N}$, there exists $t \in \mathbb{N}$ such that the homomorphism density of $C_{2l+1} \otimes t$'s in $H[\mathcal{A}]$ is at most $o(1)$. The argument is a strengthening of that used by Alon and Frankl to prove Lemma 4.2 in [1].

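Let $t \in \mathbb{N}$ to be chosen later. Choose $(2l + 1)t$ members of $\mathcal{A}$ uniformly at random with replacement, $(A_i^{(j)})_{1 \leq i \leq 2l+1,\ 1 \leq j \leq t}$. The homomorphism density of $C_{2l+1} \otimes t$ in $H[\mathcal{A}]$ is precisely the probability that the unions

\[ U_i = \bigcup_{j=1}^{t} A_i^{(j)} \]

satisfy $U_i \cap U_{i+1} = \emptyset$ for each $i$ (where the addition is modulo $2l + 1$).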

We claim that if this occurs, then $|U_i| < (\tfrac{1}{2} - \eta)n$ for some $i$, provided $\eta < 1/(4l + 2)$. Suppose for a contradiction that $U_i \cap U_{i+1} = \emptyset$ for each $i$, and $|U_i| \geq (\tfrac{1}{2} - \eta)n$ for each $i$. Then $|U_{i+2} \setminus U_i| \leq n - |U_{i+1}| - |U_i| \leq 2\eta n$ for each $i \in [2l - 1]$. Since $U_{2l+1} \setminus U_1 \subset \bigcup_{j=1}^{l} (U_{2j+1} \setminus U_{2j-1})$, we have

\[ |U_{2l+1} \setminus U_1| \leq \sum_{j=1}^{l} |U_{2j+1} \setminus U_{2j-1}| \leq 2l\eta n. \]

It follows that $|U_1 \cap U_{2l+1}| \geq (1/2 - (2l + 1)\eta)n > 0$ if $\eta < 1/(4l + 2)$, a contradiction.

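We now show that the probability of this event is very small. Fix $i \in [2l + 1]$. Observe that

\begin{align*}
\Pr\{|U_i| \leq (1/2 - \eta)n\} &= \Pr\Bigg( \bigcup_{S \subset X :\, |S| \leq (1/2-\eta)n} \ \bigcap_{j=1}^{t} \{A_i^{(j)} \subset S\} \Bigg) \\
&\leq \sum_{|S| \leq (1/2-\eta)n} \Pr\Bigg( \bigcap_{j=1}^{t} \{A_i^{(j)} \subset S\} \Bigg) \\
&\leq \sum_{|S| \leq (1/2-\eta)n} \left( \frac{2^{|S|}}{m} \right)^{t} \\
&\leq 2^n \left( \frac{2^{(1/2-\eta)n}}{c2^{n/2}} \right)^{t} \\
&= 2^{-(\eta t - 1)n} c^{-t} \\
&\leq 2^{-n} c^{-t},
\end{align*}

provided $t \geq 2/\eta$. Hence,

\[ \Pr\Bigg( \bigcup_{i=1}^{2l+1} \{|U_i| \leq (1/2 - \eta)n\} \Bigg) \leq \sum_{i=1}^{2l+1} \Pr\{|U_i| \leq (1/2 - \eta)n\} \leq (2l + 1)2^{-n}c^{-t}. \]

Therefore,

\[ h_{C_{2l+1} \otimes t}(H[\mathcal{A}]) \leq (2l + 1)2^{-n}c^{-t}. \]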
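Choose $\eta = \frac{1}{8l}$ and $t = 2/\eta = 16l$. By the lemma on homomorphism densities of blow-ups,

\begin{align*}
h_{C_{2l+1}}(H[\mathcal{A}]) &\leq \left( (2l + 1)2^{-n}c^{-t} \right)^{1/t^{2l+1}} \\
&= (2l + 1)^{1/(16l)^{2l+1}}\, 2^{-n/(16l)^{2l+1}}\, c^{-1/(16l)^{2l}} \\
&= O\left(2^{-n/(16l)^{2l+1}}\right).
\end{align*}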

Observe that the number of $(2s + 1)$-subsets of $\mathcal{A}$ containing an odd cycle of $H$ is at most

\[ \sum_{l=1}^{s} m^{2l+1}\, h_{C_{2l+1}}(H[\mathcal{A}]) \binom{m - (2l + 1)}{2(s - l)}. \]

Hence, the probability that a uniform random $(2s + 1)$-subset of $\mathcal{A}$ contains an odd cycle of $H$ is at most

\[ \sum_{l=1}^{s} \frac{m^{2l+1}}{m(m - 1) \cdots (m - 2l)}\, (2s + 1)(2s) \cdots (2(s - l) + 1)\, h_{C_{2l+1}}(H[\mathcal{A}]) \leq s(2s + 1)!\, O\left(2^{-n/(16s)^{2s+1}}\right) \]

(provided $s \leq O(\sqrt{m})$). This can be made $< 1/2$ by choosing

\[ s = a \log_2 n / \log_2 \log_2 n, \]

for some suitable $a > 0$ depending only on $c$. By (14), it follows that $H[\mathcal{A}]$ is $((\log_2 \log_2 n)^b / \log_2 n)$-close to being bipartite, for some suitable $b > 0$ depending only on $c$, proving the proposition. □

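Before proving Theorem 4 for $n$ odd, we need some more definitions. Let $X$ be a finite set. If $\mathcal{A} \subset \mathcal{P}X$, and $i \in X$, we define

\[ \mathcal{A}_i^- = \{x \in \mathcal{A} : i \notin x\}, \]

\[ \mathcal{A}_i^+ = \{x \setminus \{i\} : x \in \mathcal{A},\ i \in x\}; \]

these are respectively called the lower and upper $i$-sections of $\mathcal{A}$.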
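
The two sections translate directly into code. The following Python sketch (function names are hypothetical; families are represented as sets of frozensets) mirrors the definitions: the lower section keeps the members avoiding $i$, and the upper section takes the members containing $i$ and deletes $i$ from each.

```python
def lower_section(family, i):
    """Lower i-section: members of the family that do not contain i."""
    return {x for x in family if i not in x}


def upper_section(family, i):
    """Upper i-section: members containing i, with i removed from each."""
    return {x - {i} for x in family if i in x}
```

Since removing $i$ is injective on the sets containing $i$, the sizes of the two sections always add up to the size of the family.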

If $Y$ and $Z$ are disjoint subsets of $\mathcal{P}X$, we write $H[Y, Z]$ for the bipartite subgraph of the Kneser graph $H$ consisting of all edges between $Y$ and $Z$. If $B$ is a bipartite subgraph of $H$ with vertex-sets $Y$ and $Z$, and $\mathcal{F} \subset \mathcal{P}X$, we say that $B$ 2-generates $\mathcal{F}$ if for every set $x \in \mathcal{F}$, there exist $y \in Y$ and $z \in Z$ such that $y \cap z = \emptyset$, $yz \in E(B)$, and $y \cup z = x$, i.e. every set in $\mathcal{F}$ corresponds to an edge of $B$.
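
A minimal Python sketch of these two notions, assuming families are given as collections of frozensets (the function names are hypothetical and serve only this illustration): the edges of $H[Y, Z]$ are the disjoint pairs, and a set of edges 2-generates a target family if every target set is the union of the endpoints of some edge.

```python
from itertools import combinations


def nonempty_subsets(ground):
    """All nonempty subsets of an iterable, as frozensets."""
    ground = list(ground)
    return {frozenset(c) for t in range(1, len(ground) + 1)
            for c in combinations(ground, t)}


def bipartite_kneser_edges(Y, Z):
    """Edges of H[Y, Z]: pairs (y, z) with y in Y, z in Z and y, z disjoint."""
    return {(y, z) for y in Y for z in Z if not (y & z)}


def two_generates(edges, target):
    """True if every set in target equals y | z for some edge (y, z)."""
    produced = {y | z for y, z in edges}
    return all(x in produced for x in target)
```

With $Y$ and $Z$ the nonempty subsets of two disjoint classes, every set meeting both classes is produced by exactly one edge, while sets inside a single class are not produced by any edge.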

Proof of Theorem 4 for $n$ odd. Suppose that $n = 2l + 1 \geq 3$ is odd, $X$ is an $n$-element set, and $\mathcal{G} \subset \mathcal{P}X$ is a 2-generator for $X$ with $|\mathcal{G}| = m \leq |\mathcal{F}_{n,2}| = 3 \cdot 2^l - 2$. Observe that

\[ e(H[\mathcal{G}]) \geq 2^{2l+1} - |\mathcal{G}| - 1 \geq 2^{2l+1} - 3 \cdot 2^l + 1, \]

and therefore $H[\mathcal{G}]$ has edge-density at least

\[ \frac{2^{2l+1} - 3 \cdot 2^l + 1}{\binom{3 \cdot 2^l - 2}{2}} = \frac{2^{2l+1} - 3 \cdot 2^l + 1}{\tfrac{1}{2}(3 \cdot 2^l - 2)(3 \cdot 2^l - 3)} > \frac{4}{9}. \]

(Here, the last inequality rearranges to the statement $l > 0$.) By Proposition 18 applied to $\mathcal{G}$, we can remove at most

\[ \frac{(\log_2 \log_2 n)^b}{\log_2 n} |\mathcal{G}|^2 < \frac{(\log_2 \log_2 n)^b}{\log_2 n}\, 9 \cdot 2^{2l} \]

edges from $H[\mathcal{G}]$ to produce a bipartite graph $B$. Let $Y, Z$ be the vertex-classes of $B$; we may assume that $Y \cup Z = \mathcal{G}$. Define $\epsilon > 0$ by

\[ |\{y \cup z : y \in Y,\ z \in Z,\ y \cap z = \emptyset\}| = (1 - \epsilon)2^{2l+1}; \]

then clearly, we have

\[ e(B) \geq (1 - \epsilon)2^{2l+1}. \tag{15} \]



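Note that

\[ \epsilon \leq \frac{9}{2} \cdot \frac{(\log_2 \log_2 n)^b}{\log_2 n} + 3 \cdot 2^{-(l+1)} = O\left( \frac{(\log_2 \log_2 n)^b}{\log_2 n} \right) = o(1). \]

Let

\[ \alpha = |Y|/2^l, \qquad \beta = |Z|/2^l. \]

By assumption, $\alpha + \beta \leq 3 - 2^{1-l} < 3$. Since $|Y||Z| \geq e(B) \geq (2 - 2\epsilon)2^{2l}$, we have $\alpha\beta \geq 2 - 2\epsilon$. This implies that

\[ 1 - 2\epsilon < \alpha, \beta < 2 + 2\epsilon. \tag{16} \]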

(To see this, simply observe that to maximize $\alpha\beta$ subject to the conditions $\alpha \leq 1 - 2\epsilon$ and $\alpha + \beta \leq 3$, it is best to take $\alpha = 1 - 2\epsilon$ and $\beta = 2 + 2\epsilon$, giving $\alpha\beta = 2 - 2\epsilon - 4\epsilon^2 < 2 - 2\epsilon$, a contradiction. It follows that we must have $\alpha > 1 - 2\epsilon$, so $\beta < 2 + 2\epsilon$; (16) follows by symmetry.)

From now on, we think of $X$ as the set $[n] = \{1, 2, \ldots, n\}$. Let

\[ W_1 = \{i \in [n] : |Y_i^+| \geq |Y|/3\}, \]

\[ W_2 = \{i \in [n] : |Z_i^+| \geq |Z|/3\}. \]

First, we prove the following
Claim 1. $W_1 \cup W_2 = [n]$.

Proof. Suppose for a contradiction that $W_1 \cup W_2 \neq [n]$. Without loss of generality, we may assume that $n \notin W_1 \cup W_2$. Let

\[ \theta = |Y_n^+|/|Y|, \qquad \phi = |Z_n^+|/|Z|; \]

then we have $\theta, \phi < 1/3$. Observe that the number $e_n$ of edges between $Y$ and $Z$ which generate a set containing $n$ satisfies

\[ (1 - 2\epsilon)2^{2l} \leq e_n \leq (\theta\alpha(1 - \phi)\beta + \phi\beta(1 - \theta)\alpha)2^{2l} = (\theta + \phi - 2\theta\phi)\alpha\beta 2^{2l}. \tag{17} \]

(Here, the left-hand inequality comes from the fact that $B$ 2-generates all but at most $\epsilon 2^{2l+1}$ subsets of $[n]$, and therefore $B$ 2-generates at least $(1 - 2\epsilon)2^{2l}$ sets containing $n$.)

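Notice that the function

\[ f(\theta, \phi) = \theta + \phi - 2\theta\phi, \qquad 0 \leq \theta, \phi \leq 1/3 \]

is a strictly increasing function of both $\theta$ and $\phi$ for $0 \leq \theta, \phi \leq 1/3$, and therefore attains its maximum of 4/9 at $\theta = \phi = 1/3$. Therefore,

\[ 1 - 2\epsilon \leq \tfrac{4}{9}\alpha\beta; \]

since $\alpha + \beta < 3$, we have

\[ 3/2 - 3\sqrt{2\epsilon}/2 < \alpha, \beta < 3/2 + 3\sqrt{2\epsilon}/2. \]

Moreover, by the AM/GM inequality, $\alpha\beta < 9/4$, so

\[ 1 - 2\epsilon \leq \tfrac{9}{4} f(\theta, \phi), \tag{18} \]

and therefore

\[ 1/3 - 8\epsilon/3 < \theta, \phi < 1/3. \]

Thus $|Y|, |Z| = (3/2 - o(1))2^l$ and $\theta, \phi = 1/3 - o(1)$. Therefore, we have

\begin{align*}
|Y_n^+| &= 2^{l-1}(1 - o(1)), \\
|Z_n^+| &= 2^{l-1}(1 - o(1)), \\
|Y_n^-| &= 2^{l}(1 + o(1)), \\
|Z_n^-| &= 2^{l}(1 + o(1)).
\end{align*}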

Observe that $\mathcal{G}_n^- = Y_n^- \cup Z_n^-$ must 2-generate all but at most $o(2^{2l})$ of the sets in $\mathcal{P}\{1, 2, \ldots, n - 1\} = \mathcal{P}\{1, 2, \ldots, 2l\}$, and therefore, by Proposition M for $k = 2$ and $n$ even, there exists an equipartition $S_1 \cup S_2$ of $\{1, 2, \ldots, 2l\}$ such that $Y_n^-$ contains at least $(1 - o(1))2^l$ members of $\mathcal{P}S_1$, and $Z_n^-$ contains at least $(1 - o(1))2^l$ members of $\mathcal{P}S_2$. Define

\[ U = \{y \in Y : y \cap S_2 = \emptyset\}, \]
\[ V = \{z \in Z : z \cap S_1 = \emptyset\}. \]

Since $|U_n^-| = (1 - o(1))2^l$ and $|V_n^-| = (1 - o(1))2^l$, we must have $|Y_n^- \setminus U_n^-| = o(2^l)$, and $|Z_n^- \setminus V_n^-| = o(2^l)$. Our aim is now to show that $|Y_n^+ \setminus U_n^+| = o(2^l)$, and $|Z_n^+ \setminus V_n^+| = o(2^l)$.

Clearly, we have $U_n^- \subset \mathcal{P}S_1$ and $V_n^- \subset \mathcal{P}S_2$, so $|U_n^-| \leq 2^l$ and $|V_n^-| \leq 2^l$. Moreover, each set $x \in Y_n^+ \setminus U_n^+$ contains an element of $S_2$, and therefore $x \cup \{n\}$ is disjoint from at most $2^{l-1}$ sets in $V_n^- \subset \mathcal{P}S_2$. Similarly, each set $x \in Z_n^+ \setminus V_n^+$ contains an element of $S_1$, and therefore $x \cup \{n\}$ is disjoint from at most $2^{l-1}$ sets in $U_n^- \subset \mathcal{P}S_1$. It follows that

\begin{align*}
e_n &\leq |U_n^+||V_n^-| + |Y_n^+ \setminus U_n^+|2^{l-1} + |V_n^+||U_n^-| + |Z_n^+ \setminus V_n^+|2^{l-1} \\
&\qquad + |Y_n^- \setminus U_n^-||Z_n^+| + |Z_n^- \setminus V_n^-||Y_n^+| \\
&\leq |U_n^+|2^l + |Y_n^+ \setminus U_n^+|2^{l-1} + |V_n^+|2^l + |Z_n^+ \setminus V_n^+|2^{l-1} + o(2^{2l}).
\end{align*}

On the other hand, by (17), we have $e_n \geq (1 - o(1))2^{2l}$. Since $|Y_n^+| = 2^{l-1}(1 - o(1))$, and $|Z_n^+| = 2^{l-1}(1 - o(1))$, we must have $|Y_n^+ \setminus U_n^+| = o(2^l)$, and $|Z_n^+ \setminus V_n^+| = o(2^l)$, as required.

We may conclude that $|Y \setminus U| = o(2^l)$ and $|Z \setminus V| = o(2^l)$. Hence, there are at most $o(2^l)$ sets in $Y \cup Z = \mathcal{G}$ that intersect both $S_1$ and $S_2$. On the other hand, since $|Y_n^+| = (1 - o(1))2^{l-1}$ and $|Z_n^+| = (1 - o(1))2^{l-1}$, there are at least $(1 + o(1))2^{l-1}$ sets $s_1 \subset S_1$ such that $s_1 \cup \{n\} \notin Y$, and there are at least $(1 + o(1))2^{l-1}$ sets $s_2 \subset S_2$ such that $s_2 \cup \{n\} \notin Z$. Taking all pairs $s_1, s_2$ gives at least $(1 + o(1))2^{2l-2}$ sets of the form

\[ \{n\} \cup s_1 \cup s_2 \qquad (s_1 \subset S_1,\ s_1 \cup \{n\} \notin Y,\ s_2 \subset S_2,\ s_2 \cup \{n\} \notin Z). \tag{19} \]

Each of these requires a set intersecting both $S_1$ and $S_2$ to express it as a disjoint union of two sets from $\mathcal{G}$. Since there are $o(2^l)$ members of $\mathcal{G}$ intersecting both $S_1$ and $S_2$, $\mathcal{G}$ generates at most

\[ (|\mathcal{G}| + 1)\,o(2^l) = o(2^{2l}) \]

sets of the form (19), a contradiction. This proves the claim. □

We now prove the following
Claim 2. $W_1 \cap W_2 = \emptyset$.

Proof. Suppose for a contradiction that $W_1 \cap W_2 \neq \emptyset$. Without loss of generality, we may assume that $n \in W_1 \cap W_2$. As before, let

\[ \theta = |Y_n^+|/|Y|, \qquad \phi = |Z_n^+|/|Z|; \]

this time, we have $\theta, \phi \geq 1/3$. Observe that

\[ (2 - 2\epsilon)2^{2l} \leq e(B) \leq (1 - \theta\phi)\alpha\beta 2^{2l}. \tag{20} \]

Here, the left-hand inequality is (15), and the right-hand inequality comes from the fact that there are no edges between pairs of sets $(y, z) \in Y \times Z$ such that $n \in y \cap z$. Since $1 - \theta\phi \leq 8/9$, we have

\[ 2 - 2\epsilon \leq \tfrac{8}{9}\alpha\beta. \]

Since $\alpha + \beta < 3$, it follows that

\[ \tfrac{3}{2}(1 - \sqrt{\epsilon}) < \alpha, \beta < \tfrac{3}{2}(1 + \sqrt{\epsilon}). \]

Since $\alpha\beta < 9/4$, we have

\[ 2 - 2\epsilon < \tfrac{9}{4}(1 - \theta\phi), \]

and therefore

\[ 1/3 \leq \theta, \phi < 1/3 + 8\epsilon/3. \]

Hence, we have

\begin{align*}
|Y_n^+| &= 2^{l-1}(1 - o(1)), \\
|Z_n^+| &= 2^{l-1}(1 - o(1)), \\
|Y_n^-| &= 2^{l}(1 + o(1)), \\
|Z_n^-| &= 2^{l}(1 + o(1)),
\end{align*}

so exactly as in the proof of Claim 1, we obtain a contradiction. □

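Claims 1 and 2 together imply that $W_1 \cup W_2$ is a partition of $\{1, 2, \ldots, n\} = \{1, 2, \ldots, 2l + 1\}$. We will now show that at least a $(2/3 - o(1))$-fraction of the sets in $Y$ are subsets of $W_1$, and similarly at least a $(2/3 - o(1))$-fraction of the sets in $Z$ are subsets of $W_2$. Let

\[ \sigma = \frac{|Y \setminus \mathcal{P}(W_1)|}{|Y|}, \qquad \tau = \frac{|Z \setminus \mathcal{P}(W_2)|}{|Z|}. \]

Let $y \in Y \setminus \mathcal{P}W_1$, and choose $i \in y \cap W_2$; since at least $|Z|/3$ of the sets in $Z$ contain $i$, $y$ has at most $2|Z|/3$ neighbours in $Z$. Hence,

\[ (2 - 2\epsilon)2^{2l} \leq e(B) \leq \left(\tfrac{2}{3}\sigma\alpha\beta + (1 - \sigma)\alpha\beta\right)2^{2l} = (1 - \sigma/3)\alpha\beta 2^{2l} < (1 - \sigma/3)\tfrac{9}{4}2^{2l}, \tag{21} \]

and therefore

\[ \sigma < 1/3 + 8\epsilon/3, \]

so

\[ |Y \cap \mathcal{P}(W_1)| \geq (2/3 - 8\epsilon/3)|Y|. \tag{22} \]

Similarly, $\tau < 1/3 + 8\epsilon/3$, and therefore $|Z \cap \mathcal{P}(W_2)| \geq (2/3 - 8\epsilon/3)|Z|$.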
If $|W_1| \leq l - 1$, then $|Y \cap \mathcal{P}(W_1)| \leq 2^{l-1}$, so

\[ |Y| \leq \frac{2^{l-1}}{2/3 - 8\epsilon/3} = \frac{3}{4} \cdot \frac{2^l}{1 - 4\epsilon} < (1 - 2\epsilon)2^l, \]

contradicting (16). Hence, we must have $|W_1| \geq l$. Similarly, $|W_2| \geq l$, so $\{|W_1|, |W_2|\} = \{l, l + 1\}$. Without loss of generality, we may assume that $|W_1| = l$ and $|W_2| = l + 1$.
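We now observe that

\[ |Z| \geq (3/2 - 6\epsilon)2^l. \tag{23} \]

To see this, suppose that $|Z| = (3/2 - \eta)2^l$. Since $|Z| + |Y| < 3 \cdot 2^l$, we have $|Y| < (3/2 + \eta)2^l$. Recall that any $y \in Y \setminus \mathcal{P}W_1$ has at most $2|Z|/3$ neighbours in $Z$. Thus, we have

\begin{align*}
(2 - 2\epsilon)2^{2l} &\leq e(B) \\
&\leq |Y \cap \mathcal{P}W_1||Z| + |Y \setminus \mathcal{P}W_1| \tfrac{2}{3}|Z| \\
&\leq 2^l\left(\tfrac{3}{2} - \eta\right)2^l + \left(\tfrac{1}{2} + \eta\right)2^l\, \tfrac{2}{3}\left(\tfrac{3}{2} - \eta\right)2^l \\
&= \left(2 - \tfrac{1}{3}\eta - \tfrac{2}{3}\eta^2\right)2^{2l}.
\end{align*}

Therefore $\eta \leq 6\epsilon$, i.e. $|Z| \geq (3/2 - 6\epsilon)2^l$, as claimed. Since $|Z| + |Y| < 3 \cdot 2^l$, we have

\[ |Y| \leq (3/2 + 6\epsilon)2^l. \tag{24} \]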

We now prove the following

Claim 3. (a) $|\mathcal{P}(W_1) \setminus Y| \leq 22\epsilon 2^l$;
(b) $|Z \setminus \mathcal{P}W_2| \leq (\sqrt{\epsilon} + 2\epsilon)2^l$.

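Proof. We prove this by constructing another bipartite subgraph $B_2$ of $H$ with the same number of vertices as $B$, and comparing $e(B_2)$ with $e(B)$.
First, let

\[ D = \min\{|\mathcal{P}(W_2) \setminus Z|,\ |Z \setminus \mathcal{P}W_2|\}, \]

add $D$ new members of $\mathcal{P}(W_2) \setminus Z$ to $Z$, and delete $D$ members of $Z \setminus \mathcal{P}W_2$, producing a new set $Z'$ and a new bipartite graph $B_1 = H[Y, Z']$. Since $|Z'| = |Z| < (2 + 2\epsilon)2^l$, we have $|Z' \setminus \mathcal{P}W_2| < \epsilon 2^{l+1}$, i.e. $Z'$ is almost contained within $\mathcal{P}W_2$. Notice that every member $z \in Z \setminus \mathcal{P}W_2$ had at most $2|Y|/3$ neighbours in $Y$, and every new member of $Z'$ has at least $|Y \cap \mathcal{P}(W_1)| \geq (2/3 - 8\epsilon/3)|Y|$ neighbours in $Y$, using (22). Hence,

\[ e(B) - e(B_1) \leq \tfrac{8\epsilon}{3}|Y|D \leq \tfrac{8\epsilon}{3}|Y| \cdot \tfrac{2}{3}|Z| \leq \tfrac{16\epsilon}{9} \cdot \tfrac{9}{4} 2^{2l} = 4\epsilon 2^{2l}, \]

and therefore

\[ e(B_1) \geq e(B) - 2\epsilon 2^{2l+1} \geq (1 - 3\epsilon)2^{2l+1}. \]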

Second, let

\[ C = \min\{|\mathcal{P}W_1 \setminus Y|,\ |Y \setminus \mathcal{P}W_1|\}, \]

add $C$ new members of $\mathcal{P}(W_1) \setminus Y$ to $Y$, and delete $C$ members of $Y \setminus \mathcal{P}W_1$, producing a new set $Y'$ and a new bipartite graph $B_2 = H[Y', Z']$. Since $|Y| \geq (1 - 2\epsilon)2^l$, we have $|Y' \cap \mathcal{P}W_1| \geq (1 - 2\epsilon)2^l$. Since every deleted member of $Y$ contained an element of $W_2$, it had at most $(1 + 2\epsilon)2^l$ neighbours in $Z'$. (Indeed, such a member of $Y$ intersects $2^l$ sets in $\mathcal{P}W_2$, so has at most $2^l$ neighbours in $Z' \cap \mathcal{P}W_2$; there are $|Z' \setminus \mathcal{P}W_2| < \epsilon 2^{l+1}$ other sets in $Z'$.) On the other hand, every new member of $Y'$ is joined to all of $Z' \cap \mathcal{P}W_2$, which has size at least $|Z \cap \mathcal{P}W_2| \geq (3/2 - 8\epsilon)2^l$. It follows that

\[ e(B_2) \geq e(B_1) + C\left(\tfrac{1}{2} - 10\epsilon\right)2^l \geq (1 - 3\epsilon)2^{2l+1} + C\left(\tfrac{1}{2} - 10\epsilon\right)2^l. \tag{25} \]

We now show that $e(B_2) \leq (1 + \epsilon)2^{2l+1}$. If $|Y'| \geq 2^l$, then write $|Y'| = (1 + \phi)2^l$ where $\phi \geq 0$; $Y'$ contains all of $\mathcal{P}W_1$, and $\phi 2^l$ 'extra' sets. We have $|Z'| \leq (2 - \phi)2^l$, and therefore by (23), $\phi \leq 1/2 + 6\epsilon < 1$. Note that every 'extra' set in $Y' \setminus \mathcal{P}W_1$ has at most $2^l$ neighbours in $\mathcal{P}W_2$, and therefore at most $(1 + 2\epsilon)2^l$ neighbours in $Z'$. Hence,

\[ e(B_2) \leq 2^l(2 - \phi)2^l + \phi 2^l(1 + 2\epsilon)2^l = (1 + \phi\epsilon)2^{2l+1} \leq (1 + \epsilon)2^{2l+1}. \]

If, on the other hand, $|Y'| < 2^l$, then since $|Y'| + |Z'| < 3 \cdot 2^l$, we have $e(B_2) \leq |Y'||Z'| \leq 2^{2l+1}$. Hence, we always have

\[ e(B_2) \leq (1 + \epsilon)2^{2l+1}. \tag{26} \]

Combining (25) and (26), we see that

\[ C \leq \frac{8\epsilon}{1/2 - 10\epsilon} 2^l \leq 20\epsilon 2^l, \]

provided $\epsilon \leq 1/100$.

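This implies (a). Indeed, if $|\mathcal{P}W_1 \setminus Y| \leq C \leq 20\epsilon 2^l$, then we are done. Otherwise, by the definition of $C$, we have $|Y \setminus \mathcal{P}W_1| \leq 20\epsilon 2^l$. Recall that by (16), $|Y| \geq (1 - 2\epsilon)2^l$, and therefore

\[ |Y \cap \mathcal{P}W_1| = |Y| - |Y \setminus \mathcal{P}W_1| \geq (1 - 2\epsilon)2^l - 20\epsilon 2^l = (1 - 22\epsilon)2^l. \]

Hence,

\[ |\mathcal{P}(W_1) \setminus Y| \leq 22\epsilon 2^l, \tag{27} \]

proving (a).

Since $e(B) \geq (1 - \epsilon)2^{2l+1}$, $e(B_2) \leq (1 + \epsilon)2^{2l+1}$, and $e(B_2) \geq e(B_1)$, we have

\[ e(B_1) - e(B) \leq e(B_2) - e(B) \leq (1 + \epsilon)2^{2l+1} - (1 - \epsilon)2^{2l+1} = \epsilon 2^{2l+2}. \tag{28} \]

We now use this to show that

\[ D = \min\{|\mathcal{P}(W_2) \setminus Z|,\ |Z \setminus \mathcal{P}W_2|\} \leq \sqrt{\epsilon}\, 2^l. \]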

Suppose for a contradiction that $D > \sqrt{\epsilon}\, 2^l$; then it is easy to see that there must exist $z \in Z \setminus \mathcal{P}W_2$ with at least

\[ 2|Y|/3 - 8\sqrt{\epsilon}\, 2^l \]

neighbours in $Y$. Indeed, suppose that every $z \in Z \setminus \mathcal{P}W_2$ has less than $2|Y|/3 - 8\sqrt{\epsilon}\, 2^l$ neighbours in $Y$. Recall that every new member of $Z'$ has at least $(2/3 - 8\epsilon)|Y|$ neighbours in $Y$. Hence,

\[ e(B_1) - e(B) \geq 8D(\sqrt{\epsilon} - \epsilon)|Y| \geq 8\sqrt{\epsilon}\, 2^l(\sqrt{\epsilon} - \epsilon)(1 - 2\epsilon)2^l > \epsilon 2^{2l+2} \]

since $\epsilon < 1/16$, contradicting (28).

Hence, we may choose $z \in Z \setminus \mathcal{P}W_2$ with at least

\[ 2|Y|/3 - 8\sqrt{\epsilon}\, 2^l \]

neighbours in $Y$. Without loss of generality, we may assume that $n \in z \cap W_1$; then none of these neighbours can contain $n$. Hence, $Y$ contains at most

\[ |Y|/3 + 8\sqrt{\epsilon}\, 2^l \]

sets containing $n$. But by (27), $Y$ contains at least $(1 - 44\epsilon)2^{l-1}$ of the subsets of $W_1$ that contain $n$, and therefore $|Y| \geq (3/2 - o(1))2^l$. By (23), it follows that $|Y| = (3/2 - o(1))2^l$ and $|Z| = (3/2 + o(1))2^l$, so $Y$ contains $(1 - o(1))2^{l-1}$ sets containing $n$. Hence, by (18), so does $Z$. As in the proof of Claim 1, we obtain a contradiction. This implies that

\[ D = \min\{|\mathcal{P}(W_2) \setminus Z|,\ |Z \setminus \mathcal{P}W_2|\} \leq \sqrt{\epsilon}\, 2^l, \]

as desired.

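This implies (b). Indeed, if $|Z \setminus \mathcal{P}W_2| \leq \sqrt{\epsilon}\, 2^l$, then we are done. Otherwise, by the definition of $D$, $|\mathcal{P}(W_2) \setminus Z| \leq \sqrt{\epsilon}\, 2^l$, and therefore

\[ |Z \cap \mathcal{P}W_2| \geq (2 - \sqrt{\epsilon})2^l. \]

Since $|Z| < (2 + 2\epsilon)2^l$, we have

\[ |Z \setminus \mathcal{P}W_2| = |Z| - |Z \cap \mathcal{P}W_2| < (2 + 2\epsilon)2^l - (2 - \sqrt{\epsilon})2^l = (\sqrt{\epsilon} + 2\epsilon)2^l, \]

proving (b). □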

We conclude by proving the following
Claim 4.

\[ |\mathcal{P}(W_2) \setminus Z| \leq 4\sqrt{\epsilon}\, 2^l. \]

Proof. Let

\[ \mathcal{F}_2 = \mathcal{P}(W_2) \setminus Z \]

be the collection of sets in $\mathcal{P}W_2$ which are missing from $Z$, and let

\[ \mathcal{E}_1 = Y \setminus \mathcal{P}W_1 \]

be the set of 'extra' members of $Y$.

Since $\mathcal{G}$ is a 2-generator for $X$, we can express all $|\mathcal{F}_2|2^l$ sets of the form

\[ w_1 \cup f_2 \qquad (w_1 \subset W_1,\ f_2 \in \mathcal{F}_2) \]

as a disjoint union of two sets in $\mathcal{G}$. All but at most $\epsilon 2^{2l+1}$ of these unions correspond to edges of $B$. Since $|Z \setminus \mathcal{P}W_2| \leq (\sqrt{\epsilon} + 2\epsilon)2^l$, there are at most $(\sqrt{\epsilon} + 2\epsilon)2^l|Y|$ edges of $B$ meeting sets in $Z \setminus \mathcal{P}W_2$. Call these edges of $B$ 'bad', and the rest of the edges of $B$ 'good'. Fix $f_2 \in \mathcal{F}_2$; we can express all $2^l$ sets of the form

\[ w_1 \cup f_2 \qquad (w_1 \subset W_1) \]

as a disjoint union of two sets in $\mathcal{G}$. If $w_1 \cup f_2$ is represented by a good edge, then we may write

\[ w_1 \cup f_2 = y_1 \cup w_2 \]

where $y_1 \in \mathcal{E}_1$ with $y_1 \cap W_1 = w_1$, and $w_2 \subset W_2$, so for every such $w_1$, there is a different $y_1 \in \mathcal{E}_1$. By (24), $|Y| \leq (3/2 + 6\epsilon)2^l$, and by (27), $|Y \cap \mathcal{P}W_1| \geq (1 - 22\epsilon)2^l$, so

\[ |\mathcal{E}_1| = |Y| - |\mathcal{P}(W_1) \cap Y| \leq (3/2 + 6\epsilon)2^l - (1 - 22\epsilon)2^l = (1/2 + 28\epsilon)2^l. \]

Thus, for any $f_2 \in \mathcal{F}_2$, at most $(1/2 + 28\epsilon)2^l$ unions of the form $w_1 \cup f_2$ correspond to good edges of $B$. All the other unions are generated by bad edges of $B$ or are not generated by $B$ at all, so

\[ (1/2 - 28\epsilon)2^l|\mathcal{F}_2| \leq (2\epsilon + \sqrt{\epsilon})2^l|Y| + \epsilon 2^{2l+1}. \]

Since $|Y| \leq (3/2 + 6\epsilon)2^l$ and $\epsilon$ is small, $|\mathcal{F}_2| \leq 4\sqrt{\epsilon}\, 2^l$, as required. □

We now know that $Y$ contains all but at most $o(2^l)$ of $\mathcal{P}W_1$, and $Z$ contains all but at most $o(2^l)$ of $\mathcal{P}W_2$. Since $|Y| + |Z| < 3 \cdot 2^l$, we may conclude that $|Y| = (1 - o(1))2^l$ and $|Z| = (2 - o(1))2^l$. It follows from Proposition 16 that provided $n$ is sufficiently large, we must have $\mathcal{G} = \mathcal{P}(W_1) \cup \mathcal{P}(W_2) \setminus \{\emptyset\}$, completing the proof of Theorem 4. □

4 Conclusion 

We have been unable to prove Conjecture 1 for $k \geq 3$ and all sufficiently large $n$. Recall that if $\mathcal{G}$ is a $k$-generator for an $n$-element set $X$, then

\[ |\mathcal{G}| \geq 2^{n/k}. \]

In view of Proposition 18, it is natural to ask whether for any fixed $k$, all induced subgraphs of the Kneser graph $H$ with $\Omega(2^{n/k})$ vertices can be made $k$-partite by removing at most $o(2^{2n/k})$ edges. This is false for $k = 3$, however, as the following example shows. Let $n$ be a multiple of 6, and take an equipartition of $[n]$ into 6 sets $T_1, \ldots, T_6$ of size $n/6$. Let

\[ \mathcal{A} = \bigcup_{\{i,j\} \in [6]^{(2)}} \mathcal{P}(T_i \cup T_j); \]

then $|\mathcal{A}| \sim 15(2^{n/3})$, and $H[\mathcal{A}]$ contains a $2^{n/3}$-blow-up of the Kneser graph $K(6, 2)$, which has chromatic number 4. It is easy to see that $H[\mathcal{A}]$ requires the removal of at least $2^{2n/3}$ edges to make it tripartite. Hence, a different argument to that in Section 3 will be required.
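
The chromatic number of $K(6, 2)$ quoted above is small enough to confirm by computer. The following Python sketch (a brute-force backtracking check with hypothetical function names, intended only as an illustration) builds the Kneser graph whose vertices are the $r$-subsets of an $n$-set, joined when disjoint, and finds the least number of colours in a proper colouring.

```python
from itertools import combinations


def kneser_graph(n, r):
    """Vertices: r-subsets of range(n); two vertices are adjacent when disjoint."""
    verts = [frozenset(c) for c in combinations(range(n), r)]
    adj = {v: [w for w in verts if not (v & w)] for v in verts}
    return verts, adj


def is_colourable(verts, adj, k):
    """Backtracking test for the existence of a proper colouring with k colours."""
    colour = {}

    def extend(idx):
        if idx == len(verts):
            return True
        v = verts[idx]
        for c in range(k):
            # Colour c is allowed if no already-coloured neighbour uses it.
            if all(colour.get(w) != c for w in adj[v]):
                colour[v] = c
                if extend(idx + 1):
                    return True
                del colour[v]
        return False

    return extend(0)


def chromatic_number(verts, adj):
    """Least k for which a proper k-colouring exists."""
    k = 1
    while not is_colourable(verts, adj, k):
        k += 1
    return k
```

Running it on $K(6, 2)$, which has 15 vertices, gives 4, so the graph is not tripartite; the same routine gives 3 for the Petersen graph $K(5, 2)$.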

We believe Conjecture 1 to be true for all $n$ and $k$, but it would seem that different techniques will be required to prove this.